At Human API, a LexisNexis Risk Solutions company, we exist to radically accelerate the pace of health innovation for everyone. That is our vision. We unlock siloed health data from everywhere and put it directly into the hands of the consumer, right when they need it: enabling a meaningful transaction that creates value in their lives. Some call us the “PayPal” of health data, but our dreams are bigger.
To name a few examples, we help rapidly screen participants for clinical trials, enable life-changing software through accurate wearable data, and accelerate the buying process for important insurance products. Tomorrow we will inspire the innovation of impactful digital health products that maximize human longevity and potential. We are looking for people who are equally inspired by these big ideas and who have the grit and determination to bring them to life.
We’re looking for people who are good at engineering systems to manipulate, process, and make sense of data. If you’re an expert in some areas of data engineering, but not others, we’ll train you up. If you have the knowledge and skills to architect and design systems, we’ll let you do that and follow your lead.
The core of this role is taking complex data and making it accessible for others.
What you’ll do
- Build out our data platform capabilities across a wide range of themes (pipelines, graphs, ML models, annotations, automation, data observability, data lifecycle management, data governance, semantic layers) using a set of core services (currently AWS, Databricks, and Looker)
- Manage and scale data pipelines across terabytes of arbitrarily complex structured and unstructured data in our data lake
- Tune and scale internal and external real-time data services
- 5+ years working as an individual technical contributor in a software engineering role
- 3+ years working with distributed data systems and data engineering
Advanced proficiency in:
- Stream processing semantics (and debugging streaming applications)
- Design and maintenance of pipelines built atop distributed data technologies (Databricks/Spark, Snowflake, BigQuery/Redshift, Kafka/Kinesis, etc.)
- All aspects of managing distributed systems (design, scalability, security, deployment, observability, resiliency)
- CI/CD principles (better still, applying DevOps practices to data)
- Workflow orchestration (Airflow, Prefect, AWS Step Functions, etc.)
- SQL and Python, and a willingness to learn other languages
- Data modeling and query optimization
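For a flavor of the SQL-and-Python work above, here is a minimal, self-contained extract-transform-load sketch. The records, table, and column names are invented for illustration, and sqlite stands in for the warehouse; real pipelines in this role would run on services like Databricks/Spark rather than this toy setup.

```python
import json
import sqlite3

# Toy "raw" records, standing in for semi-structured data landing in a lake.
raw = [
    '{"user": "a1", "steps": 5200, "day": "2023-01-01"}',
    '{"user": "a1", "steps": 8100, "day": "2023-01-02"}',
    '{"user": "b2", "steps": 3000, "day": "2023-01-01"}',
]

# Transform: parse each JSON record into a typed row.
rows = [(r["user"], r["day"], r["steps"]) for r in map(json.loads, raw)]

# Load into a queryable store (sqlite here; a warehouse in practice).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE activity (user TEXT, day TEXT, steps INTEGER)")
con.executemany("INSERT INTO activity VALUES (?, ?, ?)", rows)

# Serve: a per-user aggregate, the kind of query a semantic layer exposes.
totals = dict(
    con.execute("SELECT user, SUM(steps) FROM activity GROUP BY user")
)
```

The same extract/transform/load/serve shape scales from this toy to a Spark job reading from a lake and writing to warehouse tables.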
Qualities that will help you succeed
- A keen eye for opportunities to better leverage time (both your own and others’)
- A desire to take complex problems and make them simple for others
- A desire to help others grow
- A commitment to continuous self improvement
- An ability to operate both individually and collaboratively
- Experience with analytics engineering (working with dbt, Looker)
- Experience with machine learning engineering (and NLP problems like entity recognition/resolution)
- Experience with privacy engineering