Data Gurus


Data Quality, Part 3 of 4 with Bill Reinstein, CEO of MedData Group | Ep. 86

In this continuation of a special series on data quality, you’ll hear different perspectives from other industries regarding market research and the analytics industry.

In this episode, Sima chats with Charlie Allieri, CEO of Imperium, the sponsor of this series on data quality, and Bill Reinstein, CEO and CTO of MedData Group.

Bill is a seasoned entrepreneur with over 20 years of experience building interactive media and technology-enabled service organizations. He has built MedData Group into the leading provider of healthcare professional data for digital marketing.

From Direct Email Marketer to Aggregator of HCP Data

MedData Group offers extensive demographics, firmographics, clinical behavior, and full contact data including email and digital IDs to support multi-channel marketing outreach. It also creates custom programs for vendors seeking program registrations, leads and/or web site traffic from its databases of over 4.2 million healthcare professionals.

Originally, Bill’s company had a product focused on email direct marketing for the healthcare segment, specifically targeting healthcare professionals (HCPs). As Bill notes, HCPs include physicians, nurse practitioners, physician assistants, and a wide range of allied health professionals, such as psychologists and chiropractors.

His company evolved into a data licensor and data provider, extending beyond the performance-based marketing business.

A few years ago, MedData Group became one of the largest aggregators of healthcare professional data and they realized they had an opportunity in the digital marketing (also known as “omnichannel”) space, leveraging that very large scale of offline data. Through some of their proprietary processes, they were able to link that data to what are called online identities, in a privacy-safe way.

MedData Group’s customers are primarily pharmaceutical brands and their associated advertising agencies, and they are able to target digital advertising through different channels.

Public Domain Data

A large volume of MedData Group’s data is in the public domain, or what is better described as the “quasi-public domain”.

Healthcare differs from other professional segments because of the nature of the services its professionals provide. Every healthcare professional has to be licensed at both the state level and the federal level, and they all share a common identifier (also known as a “key”): the National Provider Identifier, or NPI. This identifier remains with them through the bulk of their career.
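Because the NPI serves as the common key, a basic format check on it is a natural first hygiene step. The sketch below is illustrative only and is not taken from the episode: it assumes the standard NPI layout of 10 digits whose last digit is a Luhn check digit computed over the number prefixed with "80840". The function name `is_valid_npi` is hypothetical.

```python
def is_valid_npi(npi: str) -> bool:
    """Return True if `npi` is 10 ASCII digits with a valid Luhn check digit."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    # The check digit is computed over the NPI prefixed with "80840".
    digits = [int(ch) for ch in "80840" + npi]
    total = 0
    # Walk right to left, doubling every second digit,
    # starting with the digit next to the check digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A check like this only confirms that a number is well formed. It does not confirm that the NPI has actually been issued or that it belongs to the person in the record.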

Because of federal legislation such as the Sunshine Act, and because of transparency requirements around what drugs HCPs provide and what diagnoses they make when federal reimbursement programs are involved, this information is out there. It is not necessarily easy to find, nor is it easy to aggregate, but that is not what makes this data valuable.

Rather, the value lies in what the company adds through aggregating, structuring, and normalizing the data.

Measuring Data Quality

It’s necessary to bifurcate and look at the data in different buckets, Bill explains.

At the top level, the dividing line is whether it’s offline data or online data. 

Starting in the offline space, there are around 100 input sources of data, drawn from every state licensing database and multiple federal-level databases, as well as commercial data that MedData Group licenses, which comes from medical claims. This medical claims data is at the patient level but is non-identifying, in keeping with HIPAA protections.

On the front end, quality control starts with the various ETL (extract, transform, and load) processes, which the data management team checks with automated scripts in the platforms they use. The scripts look for anomalies first and then move on to the more demographic data. In addition to the automated review, a team of researchers is responsible for quality checking some of this data for accuracy.
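To make the idea of automated anomaly scripting concrete, here is a minimal sketch of the kind of check such a script might run after a load. It is not MedData Group's actual process: the field names (`npi`, `last_name`, `state`), the rules, and the function name `find_anomalies` are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical schema: fields every loaded record is expected to carry.
REQUIRED_FIELDS = ("npi", "last_name", "state")


def find_anomalies(records):
    """Return (row_index, reason) pairs for records that look suspect."""
    anomalies = []
    npi_counts = Counter(record.get("npi") for record in records)
    for index, record in enumerate(records):
        # Rule 1: required fields must be present and non-blank.
        for field in REQUIRED_FIELDS:
            if not str(record.get(field) or "").strip():
                anomalies.append((index, f"missing {field}"))
        # Rule 2: the same NPI should not appear on more than one row.
        npi = record.get("npi")
        if npi and npi_counts[npi] > 1:
            anomalies.append((index, "duplicate npi"))
        # Rule 3: state should be a 2-letter code.
        state = str(record.get("state") or "")
        if state and len(state.strip()) != 2:
            anomalies.append((index, "state is not a 2-letter code"))
    return anomalies
```

Rows flagged this way could then be routed to the human research team for manual review, which mirrors the split between automated and manual checking described above.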

In the offline space, there are many levels of data accuracy and hygiene.

The online space involves something called “identity resolution”, which attempts to correctly link an offline person to the correct online identity. There are different methodologies used on the back end to measure and analyze this.
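MedData Group's linking methods are proprietary and are not described in the episode, so the following is only a generic sketch of one common identity-resolution approach: matching on a normalized, hashed email address so that the raw address is not exchanged. The field names, the `online_ids` lookup, and both function names are hypothetical.

```python
import hashlib


def email_key(email: str) -> str:
    """Normalize an email address and return its SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def resolve_identities(offline_records, online_ids):
    """Link offline records to online IDs that share the same hashed email.

    offline_records: list of dicts with "npi" and "email" (hypothetical fields)
    online_ids: dict mapping hashed email -> online identifier
    Returns a dict of npi -> online identifier for the records that matched.
    """
    links = {}
    for record in offline_records:
        key = email_key(record["email"])
        if key in online_ids:
            links[record["npi"]] = online_ids[key]
    return links
```

In practice, the share of offline records that resolve to an online identity (the match rate) and the share of those links that are correct are the two figures a buyer would want to ask a provider about.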

Delivering the Data

In the case of email licensing or managed-service email, the data is hygiened over the course of a month with respect to its linkage to offline identity and offline professional data. The deliverability, or “hygiening”, of the email data itself is managed in almost real time.

Transparency in the Processes

A large volume of MedData Group’s business is in omnichannel digital, and the processes are quite complex. Although some of the processes they have developed are proprietary, they strive as a company to be transparent with clients, explaining how it all works and why they believe their approach is superior and more accurate.

They want the client to understand the processes and want the client to ask other providers about their own processes. This will provide the client with the knowledge to make an informed decision.

MedData Group was recently acquired, and the company they are now a part of is a great fit because it is also mindful of high-level premium data products.

The data, the data quality, and the accuracy of its use from a data-stewardship and compliance perspective were micro-level discussion points in both the pre-agreement stage and in due diligence. 


Sima loves to hear from her listeners with input, questions, suggestions and just to connect! You can find her at the links below!



Sima is passionate about data and loves to share, learn, and help others who share that passion. If you love data as much as she does, subscribe on iTunes and don’t forget to leave a rating and review!

This four-part series is sponsored by Imperium. We are grateful for their support in bringing you this important series.
Connect with Bill Reinstein: