From model-centric to data-centric

A new paradigm for AI development, focused on data quality

In my last blog post I covered the rise of DataPrepOps and the importance of data preparation for getting optimized results from Machine Learning based solutions. The stakes of...

The rise of DataPrepOps

Modern data development tools and how data quality impacts ML results

ML is all around us! From healthcare to education, it is being applied in many domains that affect our daily activities, and it is able to deliver many benefits. Data...

How to go from raw data to production like a pro

An odyssey on improving data quality with synthetic data and model delivery with MLOps

Machine Learning and AI are two concepts that have definitely changed our way of thinking in the last decade, and will probably change it even more in the...

Time-series Synthetic Data: A GAN approach

Generate synthetic sequential data with TimeGAN

Time-series or sequential data can be defined as any data that has a time dependency. Cool, huh, but where can I find sequential data? Well, a bit everywhere, from credit card transactions, my...

Data Pipeline Selection and Optimization

In recent years, machine learning has revolutionized how businesses and organizations operate. However, one aspect that is often overlooked is the importance of data pipelines in influencing machine learning performance. In this paper, the...

Learn from Data Science

What we have learned from talking with 100+ data scientists

One good thing about the current pandemic (probably the only good thing) is that everyone stopped spending time commuting and got to spend that time on something else. We’re glad that some of those people were kind enough to spend that...

Do something great

Portuguese startups received more than 275 million euros (PT)

More than half of Portuguese financial startups are headquartered in Lisbon, 19% chose other European countries, and 18% are in Porto. Investors point to headquarters location as an obstacle. The Top 30 of technology startups...

data science focused on data, container, Kubernetes

Should Data Science teams use Kubernetes? Hell no!

Data science teams should focus on analysing data and building models, not infrastructure management. Kubernetes is great! 1. “Kubernetes is a future proof solution.” Because it is super cool to say “future proof”. Nobody knows how the...

Air quality in California, San Francisco, United States, Golden Gate Bridge

A Machine Learning Approach to Predict Air Quality in California

Predicting air quality is a complex task that has become increasingly relevant in urban areas due to air pollution's critical impact on human health and the environment. In this context, machine learning techniques have proven to be...

Muslim family holding signs

How to deal with bias in data?

Reducing your AI bias with synthetic data

In recent days, countries have been swept by demonstrations around a topic that we do not always give the attention we should: inequalities and discrimination in our society towards black...

Private sign

What is Differential Privacy?

Does it live up to the hype?

Nowadays, it’s said that we can quantify privacy or, even better, we can rank privacy-preserving strategies and recommend the more effective ones. Well, we can suggest something that goes even a bit further and...

A computer showing a dashboard on analytics results.

Synthetic Data: the future standard for Data Science development

In today’s world, where data science is ruling every industry, the most valuable resource for a company is not its machine learning algorithms but the data itself. Since the rise of Big Data, a theoretical understanding that data is...
