Senior Director, Clinical Informatics & Health Data Science
The Regeneron Genetics Center uses genetics and health data on millions of people to advance our understanding of human disease and guide Regeneron’s therapeutic programs. You will lead our efforts to organize health information aggregated from electronic health records, surveys, digital devices, and laboratory assays from multiple collaborators. This will require supervising and mentoring a small high-performing team and working with them to develop and implement standards for health data as well as code repositories, pipelines and analysis tools that facilitate interacting with data. This will also require coordination with an extensive team of stakeholders and users of this health data.
You will need expertise in organizing and structuring large and diverse datasets as well as top-notch software engineering and orchestration skills. You will be responsible for the design of data structures that store health data. You will architect the code and processes that populate, validate, and analyze these data. You will participate in downstream analysis using machine learning, genomics, and epidemiology. This will be a dynamic, challenging position with lots of work and lots of opportunities.
A typical day might include:
Managing priorities and schedules for your team, as you work to match their abilities and expertise with the needs and goals of the organization. You will have to prioritize among a variety of tasks, ranging from datasets that might need data curation and harmonization to potential improvements to code, processes and APIs that might improve long-term team productivity.
Overseeing the design and implementation of a set of interactive tools and APIs that facilitate access to health datasets, including the calculation of basic summary statistics and informative graphics, the calculation of quality metrics and reports, and the ability for users to programmatically query and retrieve data.
Identifying courses, projects and resources that facilitate the professional development and growth of your team members, with the goal of growing their knowledge of clinical informatics, computer science, data engineering, epidemiology and collaboration.
Reviewing the structure, content, and quality of phenotype data extracted from electronic health records, surveys, digital devices, or laboratory assays. Each of these datasets may include data on hundreds of thousands of people and require coordination and input from multiple stakeholders with varied expertise.
Architecting, developing and testing the tools and code that will transform electronic health records, surveys, laboratory assays, or digital device data into a harmonized format compatible with RGC analytical tools, applications and processes. You and your team will probably be writing and updating code in Python and using associated data science libraries, such as pandas, Polars, NumPy, scikit-learn, and others.
Presenting results and summaries of these datasets and data processing plans to a variety of technical audiences, ranging from experts in statistics, epidemiology, genetics, and computation to experts in biology, drug design, and medicine. You will need outstanding communication skills and the ability to summarize complex work clearly for each of these audiences.
You will work in a highly interactive environment with a diverse team of colleagues. We highly value the ability to interact, learn, and teach so that you and other skilled individuals consistently achieve high levels of motivation, enthusiasm, and performance.
This role might be for you if:
An outstanding candidate will have an advanced degree in Computer Science, Health Informatics, Clinical Informatics, Biostatistics, or a related field. We expect 10+ years of experience in organizing and processing rich datasets, as well as demonstrated experience in managing, growing and developing junior team members.
A demonstrated knowledge of Python and key data science libraries is a must. Knowledge of R, SQL and/or C/C++ is also highly valued. If you have contributed to code in GitHub or another public repository, let us know.
A passion for teaching others and helping your team grow. We expect our people managers to make their teams better!
An understanding of strategies for mapping structured and unstructured data to ontologies such as ICD-10, RxNorm and LOINC.
Knowledge and experience applying best practices for data quality control, summarization and visualization.
A passion for learning. We are a fast-moving team in a fast-moving company. You should expect to encounter challenging work and to learn many new skills.
Ability to summarize and distil results concisely. Excellent oral presentation and writing skills.
To be considered for this position you should have demonstrated experience in management of health-related data, in application design and development, and as a people manager. This position will be based in Westchester County, New York.
Location: Tarrytown