Software Engineer, Data Engineering

Appier in Tokyo, Japan

About Appier

Appier is a software-as-a-service (SaaS) company that uses artificial intelligence (AI) to power business decision-making. Founded in 2012 with a vision of democratizing AI, Appier’s mission is turning AI into ROI by making software intelligent. Appier now has 17 offices across APAC, Europe and the U.S., and is listed on the Tokyo Stock Exchange (ticker number: 4180). Visit www.appier.com for more information.

About the role

At Appier, we have opportunities to work with data each and every day. As a Software Engineer, Data Engineering on the Appier Data Engineering team, your primary responsibility will be to partner with key stakeholders, machine learning scientists, data analysts, and software engineers to support and enable the continued growth critical to Appier. You will be responsible for creating the technology and data architecture that moves, transforms and stores the data used to improve our AI capabilities and provide insight to our customers. You will also help translate business needs into requirements and identify efficiency opportunities.

In addition to extracting, transforming and storing data, you will be expected to use your expertise to build extensible data models and data governance, and to provide meaningful recommendations and actionable strategies to partnering machine learning scientists and data analysts for performance enhancements and the development of best practices, including the streamlining of data sources and related programmatic initiatives. The ideal candidate will have a passion for working in white space and creating impact from the ground up in a fast-paced environment. We are looking for candidates at all levels of seniority. This is a local hire position.

Responsibilities

  • Partner with leadership, engineers, product managers, data scientists, and data analysts to understand data needs
  • Apply proven expertise to build high-performance, scalable data warehouses
  • Design, build and launch efficient & reliable data pipelines to move and transform data
  • Securely source external data from numerous partners
  • Intelligently design data models for optimal storage and retrieval
  • Deploy comprehensive data quality checks to ensure high data quality
  • Optimize existing pipelines and maintain all domain-related data pipelines
  • Own the end-to-end data engineering component of the solution
  • Take on-call shifts as needed to support the team
  • Design and develop new systems in partnership with software engineers and scientists to enable quick and easy consumption of data

About you

[Minimum qualifications]

  • BS/MS in Computer Science or a related technical field
  • 2+ years of Python or other modern programming language development experience
  • 2+ years of SQL and relational databases experience
  • 2+ years of experience in custom ETL design, implementation and maintenance
  • Experience with workflow management engines (e.g. Airflow, Google Cloud Composer, AWS Step Functions, or Azure Data Factory)
  • Experience with data modeling
  • Experience operating Spark or Hadoop clusters
  • Experience with managing data storage using HDFS and Cassandra

[Preferred qualifications]

  • Experience with more than one programming language (e.g. Scala or Java)
  • Contributions to open source projects are a huge plus (please include your GitHub page)
  • Experience designing and implementing real-time pipelines
  • Experience with data quality and validation
  • Experience with SQL performance tuning and end-to-end process optimization
  • Experience with anomaly/outlier detection
  • Experience with notebook-based data science workflows
  • Experience with Hadoop, Hive, Flink, Storm, Presto and related big data systems is a plus
  • Experience with a public cloud such as AWS, Azure, or GCP is a plus