Data Engineer at blackbird.ai

Remote, United States

Hourly

Contract


About this role

Are you ready to join an exciting start-up that is revolutionizing how disinformation is handled on the internet? You'll be part of a small but growing team of highly talented engineers and leaders building AI-driven services and technologies. As a Data Engineer at Blackbird.AI, you will own pipeline optimization for a real-time, cloud-hosted streaming analytics platform that spans data collection and analysis and serves results to a user dashboard for interactive visual exploration. The position requires breadth of experience with database technologies, especially engineering horizontally scalable solutions for big data.




Job Responsibilities:

  • Writes ETL processes to support ingestion and normalization of a wide variety of social media, news, and web-scrape formats (see the sketch after this list)
  • Designs database systems and develops tools for query and analytic processing, including for real-time streaming applications
  • Performs analysis and comparative empirical studies to evaluate performance tradeoffs with respect to scaling (e.g., cost vs. throughput/latency)
  • Develops, manages, and owns the database architecture for a real-time, cloud-hosted streaming analytics platform spanning data collection, analytics, and user management
  • Owns build automation, continuous integration, deployment, and performance optimization in compliance with our security requirements
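
For illustration only, a minimal Python sketch of the kind of normalization step the ETL responsibility above describes. The source labels, field names, and shared schema here are hypothetical assumptions, not Blackbird.AI's actual formats.

    # Hypothetical sketch: normalize heterogeneous source records into one schema.
    # Source labels and field names are illustrative, not Blackbird.AI's.
    from datetime import datetime, timezone

    COMMON_FIELDS = ("source", "author", "text", "published_at")

    def normalize(record: dict, source: str) -> dict:
        """Map a raw social/news/web-scrape record onto a shared schema."""
        if source == "social":
            out = {"author": record.get("user_handle"),
                   "text": record.get("body"),
                   "published_at": record.get("created_at")}
        elif source == "news":
            out = {"author": record.get("byline"),
                   "text": record.get("article_text"),
                   "published_at": record.get("publish_date")}
        else:  # generic web scrape
            out = {"author": None,
                   "text": record.get("content"),
                   "published_at": record.get("fetched_at")}
        out["source"] = source
        # Default any missing timestamp so downstream analytics can always sort.
        out["published_at"] = out["published_at"] or datetime.now(timezone.utc).isoformat()
        return {k: out.get(k) for k in COMMON_FIELDS}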



Job Requirements (Must Have):

  • BS degree in Computer Science or equivalent
  • Demonstrated product success with cloud and SaaS deployments; proven ability to develop processing pipelines for platforms that are optimized for streaming analytics applications and that are cloud-agnostic (Kubernetes, Dockerized solutions)
  • Expert-level proficiency with PostgreSQL, Neo4j (graph), Elasticsearch, MongoDB, Redis, and Druid; experience with other NoSQL and graph databases is helpful
  • Experienced with horizontal scaling of databases
  • Experienced with Kafka and Airflow (see the sketch after this list); expert at applying runtime-profiling tools to optimize throughput and latency and to establish comparative performance benchmarks
  • Capable with build automation and continuous integration and deployment (CI/CD) tools, e.g., Webpack, Buddy, or Jenkins + Docker
  • Expert-level Python development
  • Experience working with distributed teams
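
As a purely illustrative sketch of the Kafka-plus-Airflow orchestration these requirements describe, here is a minimal Airflow DAG wiring an ingest step to a load step. The DAG id, task names, topic, and target stores are assumptions, not details of Blackbird.AI's platform.

    # Hypothetical sketch of an hourly Airflow DAG: ingest from Kafka, load to storage.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def consume_from_kafka(**context):
        # In practice this would read a batch from a Kafka topic
        # (e.g. with kafka-python's KafkaConsumer) and stage it for loading.
        pass

    def load_to_store(**context):
        # In practice this would upsert normalized records into PostgreSQL,
        # Elasticsearch, or Druid, depending on the query workload.
        pass

    with DAG(
        dag_id="streaming_ingest_example",   # illustrative name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@hourly",
        catchup=False,
    ) as dag:
        ingest = PythonOperator(task_id="ingest", python_callable=consume_from_kafka)
        load = PythonOperator(task_id="load", python_callable=load_to_store)
        ingest >> load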



Desired Requirements (Helpful to Have):

  • Technical background in Artificial Intelligence (AI) and Machine Learning (ML) 
  • Experience designing and implementing interactive, query-driven human-machine intelligence systems
  • Solid skills in Java

Apply Now
