Connecting data from multiple sources opens up new ways to make business decisions from a wealth of information. Here are a few examples of how data science creates insights.
The Chicago Department of Public Health conducts thousands of restaurant inspections annually, and understanding the patterns in these inspections can be a game-changer for small businesses. Between 2010 and 2017, Chicago's restaurants faced a challenging problem: 18% of them received failing inspection outcomes.
Data analysis identified the significance of pest control violations and their connection to weather and location. Further, a classification and regression tree (CART) model showed that current inspection outcomes can be used to classify how likely a business is to fail a future inspection, and that only five key violations were needed for precise predictions.
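To make the CART idea concrete, here is a minimal sketch of the step at the heart of a classification tree: choosing the yes/no feature whose split leaves the purest groups, measured by Gini impurity. It is an illustration rather than the study's actual model. The violation names, the toy rows, and the helper functions `gini` and `best_split` are all assumptions made up for this example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, labels):
    """Return (feature, weighted_gini) for the binary feature whose
    split leaves the lowest weighted Gini impurity."""
    n = len(labels)
    best = None
    for feature in rows[0]:
        # Partition outcomes by whether the violation was recorded.
        left = [y for r, y in zip(rows, labels) if r[feature]]
        right = [y for r, y in zip(rows, labels) if not r[feature]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (feature, score)
    return best

# Hypothetical toy data: 1 means the violation appeared in the current inspection.
rows = [
    {"pest_control": 1, "signage": 0},
    {"pest_control": 1, "signage": 1},
    {"pest_control": 0, "signage": 1},
    {"pest_control": 0, "signage": 0},
    {"pest_control": 1, "signage": 0},
    {"pest_control": 0, "signage": 1},
]
# Hypothetical outcome of each restaurant's next inspection.
labels = ["fail", "fail", "pass", "pass", "fail", "pass"]

feature, score = best_split(rows, labels)
print(feature, score)
```

A full CART model repeats this search inside each resulting group until the groups are pure enough or too small to split further. In this made-up data, the pest control flag separates the outcomes cleanly, so it is chosen as the first split.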
This study serves as a universal use case for the potential of data in decision-making. By connecting data from multiple sources, businesses can uncover insights, understand challenges, and make informed choices to achieve their goals. At Datayz, we're committed to sharing this knowledge and helping businesses harness data's transformative capabilities for success.
The pharmaceutical industry plays a pivotal role in public health, yet the costs and complexities involved in bringing new drugs to market are staggering. In the United States, predicting when a drug will face generic competition is critically important. Branded drugs require extensive safety and efficacy studies, costing billions of dollars and taking over a decade of research. In contrast, generic drugs, which can be released after patent expiry, entail significantly lower development costs. The two approaches coexist under a legislative framework that balances innovation against affordability. The erosion of drug prices through generic competition is well documented, and this case study seeks to understand whether certain drugs are ripe for disruption.
The study leveraged data from the FDA's "Orange Book," a comprehensive record of approved drug products and their patent status. It also tapped into Structured Product Labels, which contain detailed information about pharmaceutical products. The analysis focused on features such as patent counts, the gap between brand approval and patent expiry, dosage forms, ingredients, and packaging components. Machine learning techniques were employed to predict generic competition based on these variables.
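As a rough illustration of how such features might be derived, the sketch below turns a single product record into a flat set of model inputs. The record layout and the field names (`approval_date`, `patent_expiry_dates`, `dosage_form`, `ingredients`) are hypothetical and do not reflect the actual Orange Book or Structured Product Label schemas. The function `build_features` is likewise an assumption for this example, not the study's pipeline.

```python
from datetime import date

def build_features(product):
    """Turn one (hypothetical) product record into a flat feature dict."""
    expiries = product["patent_expiry_dates"]
    first_expiry = min(expiries) if expiries else None
    # Gap between brand approval and the first patent expiry, in days.
    gap_days = (first_expiry - product["approval_date"]).days if first_expiry else None
    return {
        "patent_count": len(expiries),
        "approval_to_first_expiry_days": gap_days,
        "is_tablet": int(product["dosage_form"].lower() == "tablet"),
        "ingredient_count": len(product["ingredients"]),
    }

# Made-up example record; values are illustrative only.
example = {
    "approval_date": date(2005, 3, 1),
    "patent_expiry_dates": [date(2015, 3, 1), date(2018, 6, 15)],
    "dosage_form": "Tablet",
    "ingredients": ["ingredient_a", "ingredient_b"],
}

features = build_features(example)
print(features)
```

Each product becomes one row of numeric features like these, which a classifier can then use to estimate whether generic competition is likely.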
The analysis revealed compelling insights. Products approved well before the first patent expiry were more likely to face early generic competition. Tablet dosage forms were more susceptible to early competition, while liquid or injectable forms were associated with delayed generic competition. Further, the presence or absence of specific ingredients correlated with competition.
The models have limitations, including those arising from market dynamics and uncertainty about new competitors. Even so, they provide a foundation for synthesizing data from the complex interplay among patents, ingredients, and dosage forms.