We are developing a web application that presents interactive charts alongside selected database information. The charts visualize the output of a proprietary algorithm designed for analyzing large-scale engineering datasets. Our ultimate goal is to enable end users to interact with these charts in real time.
Key Features of the Project
Algorithm Integration:
Our core algorithm, developed in Python and C#, is an advanced statistical tool that fits complex engineering equations to empirical data, compares the fitted curves against the measurements, and produces optimized results.
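As a flavor of the kind of computation involved (an illustrative sketch only, not the proprietary algorithm; the model form, parameter names, and use of scipy are assumptions):

```python
# Illustrative sketch: the real algorithm is proprietary. The model function,
# starting parameters, and scipy usage here are assumptions.
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c):
    # Hypothetical engineering equation: a decaying exponential plus an offset.
    return a * np.exp(-b * x) + c

def fit_and_score(x, y):
    """Fit the model to empirical data and return the parameters plus R^2."""
    params, _ = curve_fit(model, x, y, p0=[1.0, 0.1, 0.0])
    residuals = y - model(x, *params)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return params, 1.0 - ss_res / ss_tot
```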
Data Scale & Complexity:
The project involves handling exceptionally large datasets: inputs consist of hundreds of millions of database rows, and algorithm outputs can scale to trillions of rows. Optimizing the database architecture and SQL procedures is therefore critical.
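At this scale, I/O between the database and the algorithm has to be streamed in batches rather than loaded at once. A rough sketch of that pattern, assuming a pandas/SQLAlchemy stack and hypothetical table and column names (none of which are mandated by the project):

```python
# Illustrative only: table names, column names, connection string, and the
# pandas/SQLAlchemy stack are assumptions, not project requirements.
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string; the real server, driver, and credentials differ.
engine = create_engine("mssql+pyodbc://user:pass@server/db?driver=ODBC+Driver+18+for+SQL+Server")

def run_pipeline(analyze, chunk_rows=500_000):
    """Stream input rows in batches, apply the algorithm, and append the results."""
    reader = pd.read_sql(
        "SELECT property_id, x, y FROM input_measurements",  # hypothetical table/columns
        engine,
        chunksize=chunk_rows,
    )
    for chunk in reader:
        results = analyze(chunk)  # stand-in for the proprietary algorithm
        results.to_sql("algorithm_results", engine,
                       if_exists="append", index=False, chunksize=10_000)
```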
Performance Metrics:
Results from the algorithm are assessed against benchmark recommendations based on:
- Statistical Goodness of Fit, and
- Economic Advantage
The sorted results will be visualized on a SaaS platform, highlighting their statistical performance. Our objective is to automate the processing of 500,000 to 800,000 properties in a single run.
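For example, a hypothetical scoring step that sorts results by the two benchmark criteria above might look like the following (the column names and equal weighting are assumptions, not the actual benchmark):

```python
# Hypothetical ranking step: column names and the equal weighting are assumptions.
import pandas as pd

def rank_results(results: pd.DataFrame) -> pd.DataFrame:
    """Sort fitted results by combined goodness of fit and economic advantage."""
    scored = results.copy()
    for col in ("r_squared", "economic_advantage"):  # hypothetical column names
        lo, hi = scored[col].min(), scored[col].max()
        # Normalize each criterion to [0, 1] so the two can be combined.
        scored[col + "_norm"] = (scored[col] - lo) / ((hi - lo) or 1.0)
    scored["score"] = 0.5 * scored["r_squared_norm"] + 0.5 * scored["economic_advantage_norm"]
    return scored.sort_values("score", ascending=False)
```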
Responsibilities
- Design, optimize and enhance database architecture to efficiently handle large-scale data operations.
- Develop and implement robust data pipelines for seamless input/output between the algorithm and database.
- Ensure data integrity, security and performance across all databases.
- Collaborate with the development team to enable dynamic, real-time user interaction with chart data.
- Contribute to algorithm refinement to improve statistical modeling and economic insights.
- Participate in the future implementation of machine learning models to enhance data processing.
Qualifications
- Bachelor's or Master's degree in Computer Science, Data Science, Data Analytics, or a related field.
- Strong knowledge of SQL and experience optimizing large-scale databases.
- Proficiency in database design, indexing, and query optimization techniques.
- Strong experience with Azure data pipeline tools and ETL processes.
- Familiarity with big data technologies and data warehousing solutions.
- Exceptional knowledge of C# and its integration with databases.
- Solid understanding of database security practices.
- Ability to work with backend technologies, including Azure, MongoDB, and other NoSQL databases, to integrate databases with applications.
- Working knowledge of prompt engineering techniques for tools such as ChatGPT, and an eagerness to adopt and deploy artificial intelligence in your workflow.
- Interest/experience in machine learning frameworks (TensorFlow, PyTorch) is advantageous.
- Comfortable using project management and communication tools such as Asana, Slack, or Jira.
- Strong analytical and problem-solving skills.
- Ability to work independently and collaboratively within a team.
- Excellent communication skills.