Team Overview

Our team has 92 members and works across many areas of machine learning and data science. We are organized into five subteams: Intelligent Systems, Insights, Data Engineering, Algorithmic Trading, and Education, each with its own primary area of focus.

Executive Board

Cornell Data Science is more than just a project team – we also teach courses, lead educational initiatives, and invest in the exploration of new frontiers. For this reason, we have an executive board which manages our general operations and several subteams which lead our technical direction. Our focus is on building a strong and diverse data science community for undergraduates. We work closely with university administration, professors, and companies to prepare students for the information age.

Intelligent Systems

We work at the intersection of machine learning research and practical engineering to create systems that solve problems humans cannot, whether because of scale or accuracy. Our work ranges from natural language processing to computer vision. Current efforts include automatic question answering and scalable map generation for unmapped communities using object detection and IBM Research infrastructure.

Data Engineering

Our focus is on the exploration and application of data engineering. Our members have worked on projects ranging from the creation and maintenance of CDS' own compute server cluster to automated profiling of geographic information systems. Current team objectives include the expansion of CDS infrastructure and the development of a Raspberry Pi cluster.

Insights

Our goal is to provide insight into complex systems through a combination of data science and visual interfaces. Our work is interdisciplinary: our projects range from visualizing neural networks to applying natural language processing and deep learning to detect fake news.

Algorithmic Trading

Our team develops algorithmic trading strategies. In general, we want to find a portfolio of stocks and trade them so that we go long (buy) at a lower price and go short (sell) at a higher one. More specifically, we apply statistical techniques to determine which equities to trade, and machine learning techniques to determine when to enter or exit a position. Of course, PnL (Profit and Loss) drives the business of trading, so our goal is essentially an algorithm that maximizes profit without exposing us to too much risk.
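
To make the idea of PnL concrete, here is a minimal, hypothetical sketch in Python. It does not represent the team's actual strategy: the `Trade` class, the `pnl` function, and every price and share count are assumptions made up for illustration, and fees and slippage are ignored.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    """One closed round-trip trade (hypothetical structure for illustration)."""
    side: str           # "long" (buy, then sell) or "short" (sell, then buy back)
    entry_price: float
    exit_price: float
    shares: int


def pnl(trade: Trade) -> float:
    """Profit and loss of a closed trade, ignoring fees and slippage."""
    move = trade.exit_price - trade.entry_price
    if trade.side == "long":
        return move * trade.shares    # a long profits when the price rises
    if trade.side == "short":
        return -move * trade.shares   # a short profits when the price falls
    raise ValueError(f"unknown side: {trade.side!r}")


# Made-up trades, not real market data.
trades = [
    Trade("long", entry_price=100.0, exit_price=105.0, shares=10),
    Trade("short", entry_price=50.0, exit_price=48.0, shares=20),
    Trade("long", entry_price=30.0, exit_price=29.0, shares=10),
]

total_pnl = sum(pnl(t) for t in trades)
worst_loss = min(pnl(t) for t in trades)
print(total_pnl)   # 80.0
print(worst_loss)  # -10.0
```

Total PnL is the quantity a strategy tries to grow, while the worst single-trade loss is one crude stand-in for risk; a real strategy would use richer risk measures.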

Education

We train students to think and work like data scientists so they can make meaningful contributions to data science projects. Our Training Program is a semester-long course that helps students develop an overarching understanding of data science, and our Deep Dives are shorter workshops that introduce students to the latest technologies that are shaping the world of data science.

In addition to our subteam-specific projects, we also take on many projects as an entire team. These are open to anyone regardless of subteam and usually incorporate many different aspects of software engineering, data science, and machine learning.