With the growing adoption of machine learning (ML) solutions across sectors, it is critical to ensure that their societal impact is positive. In this work, I share key practices for achieving trustworthy ML, including the role of domain experts in the process and data valuation for potential systematic deviations. I also reflect on our recent work across climate/sustainability and healthcare, including the automated identification and characterization of systematic deviations for tasks such as data quality understanding, temporal drift, treatment effects analysis, and new class detection. Furthermore, I share how our generalized approach helps evaluate the capabilities of generative models in domain-agnostic and interpretable ways. I argue that similar data-centric analysis should extend beyond curated ML datasets to traditional data sources.