Staff Data Scientist
@WalmartLabs is the technical powerhouse behind Walmart Global eCommerce. We employ big data at scale -- from machine learning, data mining and optimization algorithms, to modeling and analyzing massive flows of data from online, social, mobile and offline commerce. We don’t just engineer cool websites, mobile apps, and new services; we use our own open source tools to create the framework. Deployment is automated and accelerated through our open cloud platform. This makes us incredibly nimble and able to adjust in real time to the needs of our global customers.
The specific position is on the "Perceive" team within Search, led by Somnath Banerjee.
Perceive is critical in capturing and understanding customer inputs. The team currently consists of 1 tech lead, 2 data scientists (possibly 1 if Thanh leaves), and 3 engineers. The team is responsible for 7-8 major technologies impacting the majority of our traffic. It owns Type Ahead, Spell Check, LHN, Query Categorization, QUS, Entity Search – Known Queries, Entity Search – Generated Queries, and Query rewrites and normalization.
- A Staff Data Scientist is responsible for analyzing large data sets to develop multiple custom models and algorithms to drive innovative business solutions. Staff Data Scientists work on large project teams in order to provide analytical support and guidance to an assigned area for large projects (for example, email targeting, business optimization, consumer recommendations) within Walmart eCommerce. Staff Data Scientists are responsible for building large data sets from multiple sources in order to build algorithms for predicting future data characteristics. Those algorithms will be tested, validated, and applied to large data sets. Staff Data Scientists are responsible for training the algorithms so they can be applied to future data sets and provide the appropriate search results. Staff Data Scientists are responsible for researching new trends in the industry and utilizing up-to-date technology (for example, HBase, MapReduce, LAPack, Gurobi) and analytical skills to support their assigned project. Staff Data Scientists are the subject matter experts for statistical analysis and modeling for their project team.
- Build complex data sets from multiple data sources, both internal and external.
- Build learning systems to analyze and filter continuous data flows and offline data analysis.
- Collaborate with cross-functional partners across the business.
- Collaborate with project teams to implement data modeling solutions.
- Combine data features to determine search models.
- Conduct advanced statistical analysis to determine trends and significant data relationships.
- Develop models of current state in order to determine improvements needed.
- Develop multiple custom data models to drive innovative business solutions.
- Drive the execution of multiple business plans and projects.
- Ensure business needs are being met.
- Interpret data to identify trends that carry across future data sets.
- Promote and support company policies, procedures, mission, values, and standards of ethics and integrity.
- Provide supervision and development opportunities for associates.
- Research new techniques and best practices within the industry.
- Scale new algorithms to large data sets.
- Train algorithms to apply models to new data sets.
- Translate business needs into data requirements.
- Utilize system tools including MySQL, Hadoop, Weka, R, Matlab, and ILog.
- Validate models and algorithmic techniques.
- PhD in computer science or a similar field, or MS with at least 2-5 years of related experience.
- Deep knowledge of machine learning, information retrieval, data mining, statistics, NLP or related field.
- Good functional coding skills in C++ or Java (Java is highly preferred); candidates must be capable of spending up to 10% of the work day writing production code in C++, Java, Hadoop, or Hive.
- Expert level knowledge of one of the scripting languages such as Python or Perl.
- Superior ability to analyze and interpret the results of product experiments.
- Proven experience working with statistical languages such as R.
- Experience working with large data sets and distributed computing tools (Map/Reduce, Hadoop, Hive, Spark, etc.) is a plus.
- Strong communication skills, both written and verbal.
Additional Preferred Qualifications
- Knowledge of Spark and scikit-learn, strong problem-solving skills, and willingness to learn new technologies.
- Self-starter, quick learner, and keen observer with an eye for detail who relishes challenges.
Walmart Global eCommerce comprises Walmart.com, VUDU, SamsClub.com, and our technical powerhouse @WalmartLabs. Here, innovators incubate next-gen e-commerce solutions in real time. We integrate online, physical, and mobile shopping experiences for billions of customers around the globe. How do we do it? We continuously build and invest in new technology including open source tools and big data innovations. Data scientists, front and back-end engineers, product managers, and web and UX/UI teams collaborate alongside e-commerce experts to envision, prototype, and bring revolutionary ideas to life in a dynamic, flexible and fun work culture.