What do data and technology have to do with a top ten bank? The answer is “Everything”. At Capital One, data lives at the center of everything we do. When we launched in 1988, we quickly disrupted the credit card industry by personalizing every credit card offer using statistical modeling and relational databases. Fast-forward, and this little innovation and our passion for data have skyrocketed us to become a Fortune 200 company and a leader in the world of data-driven decision-making.
We take our mission of Changing Banking for Good very seriously. As a Data Scientist at Capital One, you’ll be part of a team that’s leading the next wave of disruption at a whole new scale. You’ll use the latest distributed computing technologies and work across billions of customer transactions to unlock the big opportunities that help everyday people save money and remove the “friction” from their financial lives.
The Capital One Data Science campus internship program will give you the chance to see firsthand how your skills and education translate into solving business problems in the financial services industry. As a data science intern, you will work closely with our data science community to solve a variety of financial services problems using cutting-edge techniques.
Interns are full-fledged members of the team, engaged from day one through structured training, a diverse set of experiences, a network of peers across the company, and leadership opportunities. The program is designed to provide exposure to real, complex operational projects.
On any given day you might:
Evaluate open source and internally developed modeling and analytics tools. These are both “big” and “little” data oriented and could be developed in R, Python, C/C++, Scala, H2O, or a language that is “TBD” and waiting to be unpacked. You will also have the opportunity to work with AWS, internal Hadoop, Spark, Pig, Hive, Impala, and other processing stack elements.
Share evaluation results and insights with data science teams via internal blogs, Stack Exchange, and GitHub.
Write tools, wrappers, and scripts to help teammates perform their jobs more efficiently and effectively. For example, build a standard pipeline that runs grid-search model parameter estimation and then flows straight into a series of diagnostics on how well the resulting model performs.
Write software to clean and investigate large, messy data sets of numerical and textual data.
Integrate with external data sources and APIs to discover interesting trends (NOAA Weather Data + Credit Card Transactions = Fascinating!).
Design rich data visualizations to communicate complex ideas to customers and company leadership.
Develop an internal portal to streamline processes and accelerate discovery time for new information related to tools, APIs, etc.
The Ideal Candidate will be:
Smart. You’re a top performing student.
Curious. You ask why, you explore, and you’re not afraid to introduce and defend a crazy idea. You create machine learning models or data sets to “challenge” existing models, “breaking” developed models to ensure reliability and resiliency. You review and replicate models and create real and thought experiments to determine whether and how a model holds up under different scenarios.
Data Savvy. You know how to move data around, from a database or an API, through a transformation or two, a model, and into human-readable form (ROC curve, chart, map, visualization, etc.). You know Python, Java, R, C/C++, Spark, Storm, Julia, SQL, Matlab, Mahout, or think everything can be done in a Perl one-liner.
Do-er. You have a bias toward action, you try things, and sometimes you fail. Expect to tell us what you’ve shipped and what’s flopped.
Fearless. Big, undefined problems and petabytes don’t frighten you. You’re not intimidated by new tools or technologies.
Basic Qualifications:
Have obtained or will obtain at least a Master’s or PhD degree in a quantitative field of study between December 2016 and August 2017
At least 6 months of experience or course work in open source programming languages for data analysis
At least 6 months of experience or course work in machine learning or predictive analytics
Preferred Qualifications:
Direct experience with either Python or R plus one other general purpose programming language such as Java or C/C++
Experience or course work with relational databases
Experience or course work with large scale data analysis
Experience or course work in statistical modeling
Capital One will consider sponsoring a new qualified applicant for employment authorization for this position.
At Capital One, we’re changing banking for good and helping people live their best lives. We’re building one of America’s leading information-based technology companies, using our digital fluency to transform everything about the customer experience. We think and work like a tech company, always ready to dream, disrupt and deliver a better way—for our customers, the financial industry, and each other. Relentless innovation is our way of life.