Job Description
About The Role:
FactSet is looking for a leader to head the Data Mining Team and increase the company's ability to automate across a wide variety of data sources. He/She will collaborate with the Data Lake team and contribute to building the pipeline by enhancing or enriching the data using machine learning techniques.
Manage a science agenda that balances short-term deliverables with measurable business impact against long-term projects. Must be proficient in statistical analysis, standard machine learning techniques, and ML model deployment engineering best practices.
- Work with data scientists, engineers, and cross-functional teams to produce end-to-end production-ready solutions
- Drive a culture of quality, performance, scalability, and reliability
Total Experience: 7 to 9 years
- Must have
  - Computer skills
    - Hands-on programming experience in Python
    - Comfortable working in Linux and Windows environments
    - Hands-on experience with source control, code review, and testing frameworks
    - Hands-on experience with Big Data frameworks such as Hadoop and Spark
    - Knowledge of both SQL and NoSQL databases (column, document, key-value)
  - Cloud
    - Basic knowledge of cloud environments, their constraints and opportunities
  - Statistics
    - Basic knowledge of statistics: probability, correlation, regression, linear algebra, stochastic processes
    - Hands-on experience with data analysis, data cleanup, and data investigation, including how to detect and handle imbalance, bias, and noise
    - Proficient in data structures and algorithms, in particular lists, queues, trees, and graphs; sorting, searching, traversal, and dynamic programming; and the map-reduce pattern
    - Knowledge of and hands-on experience with ML and NN algorithms and frameworks
    - Knowledge of and hands-on experience with Natural Language Processing
  - Others
    - Hands-on experience with data mining and data science projects, from understanding the problem to presenting results and deploying to production
    - Communication skills: project presentation and the ability to explain techniques and results to a non-specialist audience
    - Knowledge of Data Mining / Data Science projects and publications
- Nice to have
  - Computer skills
    - Hands-on programming experience in other languages such as R, Java, Julia, or MATLAB
    - Hands-on experience with Jupyter Notebook/JupyterLab or VS Code
    - Hands-on experience with Git/GitHub
  - Cloud
    - Hands-on experience with AWS cloud / SageMaker