About this role
Are you ready to join an exciting start-up that is revolutionizing how disinformation is handled on the internet? You will join a small but growing team of highly talented engineers and leaders building exciting AI-driven services and technologies. As a Data Engineer for Blackbird.AI, you will own pipeline optimization for a real-time streaming, cloud-hosted analytics platform that spans data collection and analysis and serves results to a user dashboard for interactive visual exploration. This position requires a breadth of experience with database technologies, especially the engineering of horizontally scalable solutions for big data.
Job Responsibilities:
- Writes ETL processes to support ingestion and normalization of a wide variety of social media, news, and web scrape formats
- Designs database systems and develops tools for query and analytic processing, including for streaming real-time applications
- Performs analysis and comparative empirical studies to evaluate performance tradeoffs with respect to scaling (e.g., cost vs. throughput/latency)
- Develops, manages, and owns the database architecture for a real-time streaming, cloud-hosted analytics platform, spanning data collection, analytics, and user management
- Owns build automation, continuous integration, deployment and performance optimization in compliance with our security requirements
Job Requirements (Must Have):
- BS degree in Computer Science or equivalent
- Demonstrated product success with cloud deployment and the SaaS model; proven capability to develop processing pipelines for platforms that are optimized for streaming analytics applications and that are cloud-agnostic (Kubernetes, Dockerized solutions)
- Expert-level proficiency with PostgreSQL, Neo4j (graph), Elasticsearch, MongoDB, Redis, and Druid; experience with other NoSQL and graph databases is helpful
- Experienced with horizontal scaling of databases
- Experienced with Kafka and Airflow; expert in applying tools for runtime profiling to optimize throughput and latency and establish comparative performance benchmarks
- Proficient with build automation and continuous integration and deployment (CI/CD) tools, e.g., Webpack, Buddy, or Jenkins with Docker
- Expert-level Python development
- Experience working with distributed teams
Desired Requirements (Helpful to Have):
- Technical background in Artificial Intelligence (AI) and Machine Learning (ML)
- Experience designing and implementing interactive query-driven man-machine intelligence systems
- Solid skills in Java