Lead Data Scientist, Real Estate

CAPE Analytics in Palo Alto, CA/Remote

$164,000 - $246,000

CAPE Analytics provides inspection-quality condition and property characteristics data for 110 million unique structures across the United States, derived from aerial imagery via advanced computer vision. This enables HELOC originators and investors, single-family rental investors, and loan traders to access valuable property attributes with the accuracy and detail that traditionally required an on-site inspection. Leveraging CAPE Analytics, these verticals have access to more accurate property valuations, improved decisions around bids, buys, and capital expenditure needs for portfolios, and increased efficiency by avoiding time-consuming research. Founded in 2014, CAPE Analytics is backed by leading venture firms and innovative insurers and is comprised of computer vision, data science, and risk analysis experts.

THE OPPORTUNITY
As a Lead Data Scientist on CAPE’s Data Science team, you’ll collaborate with Data Scientists, Computer Vision/Machine Learning Engineers, Data Engineers, and members across Software Engineering, Product, and Sales teams to build robust, scalable machine learning models for identification and annotation of the built world. Additionally, you will develop expertise in ground truth generation, model performance analysis, iterative model development, and unsupervised mapping of the feature space to bring scientific rigor, scalability, and robust performance to our core product offerings.

As a senior member of the team, you will also oversee the work of other data scientists on the team and work with Product Managers to plan the team's roadmap.

Over the past 6 years, we’ve constructed an analytics platform purpose-built for deep learning that has led us to be adopted by leading insurance carriers across the U.S., Canada, and Australia... but we are just getting started. On the heels of our recent $44 million Series C financing, we’re growing rapidly. In CAPE’s next phase, we’re setting out to solve the biggest problems in the Real Estate industry.

THE TECH STACK
CAPE leverages all available tools and technologies to build our best-in-class tech stack, which gives us the flexibility of fast deployments along with the stability to support aggressive SLAs for critical-path client APIs and applications. We build our models using PyTorch and TensorFlow, and leverage Python, Spark, and Postgres across our AWS-deployed cloud infrastructure.

THE TEAM
You will work with some of the smartest data scientists in the industry. They are passionate about the work they do and have collectively built the industry’s leading AI/Analytics product. Success only comes with great team culture, camaraderie, open communication, and hard work. These are the qualities that you will experience and enjoy at Cape.

COMPENSATION & BENEFITS
Cape Analytics believes in creating a more equitable environment for everyone, and is committed to standing against wage gap disparities that are widened by limited pay transparency.

Our base range for this position is: $164,000 - $246,000.

Positions at Cape may also include stock options, bonus opportunities, and/or variable incentive pay (commissions) to supplement your base earnings. Additionally, Cape offers top-notch insurance options and competitive benefits, such as unlimited PTO, company outings, remote work capabilities, and more!

We believe:

*Talent is critical, but best when tempered with humility
*Self-motivation leads to the best outcomes
*Open, direct communication is a sign of respect
*Teamwork drives success
*Having fun together is an important part of the job

***CAPE Analytics is an E-verify participant.***

RESPONSIBILITIES
    • Develop scientifically rigorous, creative methodologies to continuously improve our machine learning models
    • Incorporate machine learning and data-driven decisioning into the core of our infrastructure
    • Explore and mine new data sources that will help optimize and validate our models
    • Link model capabilities to market needs by customizing models and by designing and running validation studies
    • Start to assist in Sprint planning and Quarterly planning with the team
    • Contribute to design and automation of model training, model post-processing and evaluation pipelines at scale
    • Leverage the extensive data generated by Cape in addition to data from external sources to generate structured knowledge about our feature space
    • Implement automated solutions for ensuring data quality and delivery
    • Contribute to peer mentorship, knowledge bases, and skills transfer
    • Be primarily responsible for roadmap planning with Product team along with Sprint planning and Quarterly planning
    • Present your results internally and externally
    • Defend your methodology and incorporate feedback from internal teams as well as customers
    • Improve model performance by identifying failure modes using supervised and unsupervised learning techniques
    • Ideate and implement data-driven methodologies to help scale model performance across geographical, climatic, and temporal dimensions

REQUIREMENTS
    • PhD in a STEM field with 5 years of hands-on industry experience or Masters in a STEM field with 7 years of hands-on industry experience
    • 1-3 years of experience in technical management of other data scientists
    • A background in the Finance or Real Estate sector is strongly preferred. This includes familiarity with Real Estate data such as MLS and other public record data, Mortgage Loans, Automated Valuation Models, Asset Valuations, Cash Flow Analysis, Risk Analysis etc.
    • Solid knowledge of statistical techniques, including hypothesis testing, statistical sampling, significance testing, statistical inference, maximum likelihood estimation, and experimental design, among others
    • Mastery of supervised and unsupervised algorithms and their implementations, and of machine learning concepts including regularization, learning curves, hyperparameter optimization, and cross-validation, among others
    • Advanced knowledge of and significant programming experience in Python or another scripting language, including relevant libraries such as NumPy, pandas, SciPy, and Matplotlib
    • Familiarity with the Linux environment, including shell scripting, Git, and tools for reproducibility (e.g. virtual environments, Docker)
    • Demonstrated expertise in building data tools for ETL and data analysis
    • Experience in building meaningful data visualizations using at least one scripting-based visualization tool such as Matplotlib, D3.js, or Bokeh
    • Nice to haves: Experience designing data schemas and extracting data from SQL and NoSQL databases. Experience with GIS systems. Experience with modern data technologies, e.g. Spark, PyTorch, Jupyter Notebook, Docker. Experience with cloud computing on AWS or GCP.