Advertising/Marketing/Public Relations, Engineering, Information Technology, Sales, Quality Control, Research & Development, Writing/Authoring
What Will You Be A Part Of?
When you are part of the team at Thermo Fisher Scientific, you’ll do important work, like helping customers find cures for cancer, protecting the environment, or making sure our food is safe. Your work will have real-world impact, and you’ll be supported in achieving your career goals.
How Will You Make An Impact?
Thermo Fisher Scientific is seeking a Data Engineer located in Carlsbad, CA to work with the Data Science and Data Architecture teams to build Databricks-based data pipelines and bring data onto our enterprise-level data platform for Data Science, Analytics, and Digital Marketing needs. The data platform is primarily based on Oracle Exadata, AWS Redshift, and Databricks Delta technologies, transitioning toward a Lakehouse architecture to enable Data Science, Data Analytics, Customer Analytics, and Data Services for critical application and business enablement.
What Will You Do?
Design, develop, test, deploy, support, and enhance data integration solutions to seamlessly connect and integrate Thermo Fisher enterprise systems with our Data Science and Enterprise Data Platform.
Innovate on data integration within our Apache Spark-based platform to ensure technology solutions leverage cutting-edge integration capabilities.
Facilitate requirements-gathering and process-mapping workshops, review business/functional requirement documents, and author technical design documents, test plans, and scripts.
Assist with implementing standard operating procedures, facilitate review sessions with functional owners and end-user representatives, and leverage technical knowledge and expertise to drive improvements.
Define, design, and document reference architecture, and lead the implementation of BI and analytical solutions.
Follow agile development methodologies and DevOps practices to deliver solutions and product features.
How Will You Get Here?
A 4-year degree in computer science or engineering (or equivalent) from an accredited university is preferred and may substitute for a minimum of 3-5 years of professional IT experience.
Experience, Knowledge, Skills, Abilities
Experience with Databricks, Data/Delta Lake, Oracle, or AWS Redshift-type relational databases.
Extensive experience in Databricks/Spark-based data engineering pipeline development.
3+ years working experience in Python-based data integration and pipeline development.
Data Lake and Delta Lake experience with AWS Glue and Athena.
2+ years of experience with AWS cloud data integration across Apache Spark, Glue, Kafka, Elasticsearch, Lambda, S3, Redshift, RDS, and MongoDB/DynamoDB ecosystems.
Strong hands-on experience in Python development, especially PySpark, in an AWS cloud environment.
Ability to design, develop, test, deploy, maintain, and improve data integration pipelines.
Experience in Python and common python libraries.
Strong analytical database experience: writing complex queries, query optimization, debugging, user-defined functions, views, indexes, etc.
Strong experience with source control systems such as Git, and with build and continuous integration tools such as Jenkins.
Highly self-driven and execution-focused, with a willingness to do "what it takes" to deliver results, as you will be expected to rapidly handle a considerable volume of data integration demands.
Understanding of development methodologies and hands-on experience writing functional and technical design specifications.
Excellent verbal and written communication skills, in person, by telephone, and with large teams.
Strong prior technical development background in either Data Services or Engineering.
Demonstrated experience resolving complex data integration problems.
Must be able to work cross-functionally. Above all else, must be equal parts data-driven and results-driven.
Location: Carlsbad, CA preferred. Open to other remote locations.