African Datasets

Empowering Africa Through Data Innovation

The Deep Learning Indaba is delighted to announce a Call for African Datasets as part of Deep Learning Indaba 2026, taking place in Nigeria. Our mission is to strengthen machine learning (ML) and artificial intelligence (AI) capacity across the African continent. A cornerstone of this vision is the development and sharing of datasets that originate from, are relevant to, or center on Africa, and that reflect the continent's diverse realities, challenges, and opportunities.

Join Us in Shaping Africa's Future

This initiative positions data not merely as a technical asset, but as a form of digital public infrastructure critical to building equitable, impactful, and globally relevant AI systems. We aim to create a trusted, high-quality repository of Africa-relevant datasets that can be used by researchers, practitioners, startups, policymakers, and civil society organizations. These datasets will enable the development of AI solutions tailored to Africa’s unique contexts, challenges, and opportunities. By fostering the collection and sharing of African datasets, we aim to:

Bridge the Data Gap

Address the current lack of datasets accurately representing African languages, cultures, environments, and challenges.

Catalyze Collaboration

Build a community of contributors and users committed to advancing African AI.

Empower Local Innovation

Provide researchers and developers with the tools to create solutions that are not only innovative but also contextually relevant and impactful.

Promote Ethical Data Use

Ensure data collection and usage prioritize privacy, consent, and equity, setting a gold standard for ethical AI development.

Why This Matters

Data is the lifeblood of AI and machine learning. Yet much of the data used to train today’s systems comes from contexts that do not reflect the African experience. Without representative data, the solutions developed may fail to address the continent’s specific needs, or may perpetuate bias and inequality.
This initiative is more than just a call for data; it is a call to action for inclusivity and equity in the global AI landscape. African datasets will:

Enhance Representation

Ensure that Africa’s diversity is reflected in AI models, contributing to global technological progress that benefits all.

Drive Problem-Solving

Equip researchers with the tools to tackle Africa-specific challenges in health, education, agriculture, and urban development.

Shape Policy and Governance

Support evidence-based decision-making and policies rooted in accurate, local data.

What We Are Looking For

We invite contributions from individuals, academic institutions, private organizations, NGOs, and government bodies. Submissions can include, but are not limited to, datasets in the following domains:

Infrastructure

Urban planning, transportation, and geospatial data for infrastructure development.

Language & Culture

Text, audio, and video datasets representing African languages, dialects, and cultural heritage.

Environment

Biodiversity records, conservation data, and climate information.

Health

Public health records, medical imaging, disease surveillance, and epidemiology.

Agriculture

Data on crop health, soil conditions, weather patterns, and resource management.

Education

Learning outcomes, digital education tools, and multilingual datasets.

NB: Submissions that cut across multiple domains or disciplines are welcome and encouraged. All contributions must adhere to ethical guidelines, including data anonymization, informed consent, and compliance with local and international data governance standards.

Innovative Focus Areas

To push the frontier of African AI, we especially encourage datasets that:

  • Support low-resource and underrepresented African languages
  • Enable multimodal learning (e.g., text + audio + vision)
  • Capture longitudinal or temporal dynamics (e.g., climate trends, health outcomes, education trajectories)
  • Include community-annotated or participatory data collection
  • Address real-world deployment constraints, such as limited connectivity or compute

Contribute with Your Dataset

Step 1

Prepare Your Dataset: Format and document your dataset comprehensively, ensuring anonymization where applicable. Submissions must comply with ethical standards and relevant data protection regulations.

Step 2

Confirm Ownership and Permissions: Only submit datasets that you created yourself or for which you have explicit permission from all relevant collaborators, institutions, or supervisors. By submitting, contributors confirm they have the right to share the dataset in the manner described.

Step 3

Select a License: Clearly state the license under which the dataset is shared (e.g., open, restricted, or on-request), and any conditions for reuse.

Step 4

Complete the Submission Form: Visit the application form to provide detailed metadata about your dataset, including its source, structure, and potential applications.

Step 5

Upload Your Dataset: Use our secure platform to upload your files or provide links to external repositories.
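
As an illustration of Steps 1, 3, and 4, the sketch below shows one possible way to prepare records and describe them before submission. It is not an official Indaba tool or a required format: the function names, field names, sample rows, and license string are all hypothetical. It replaces a direct identifier with a keyed hash and drops other identifying fields. Note that this is pseudonymization only; whether it is sufficient anonymization for a given dataset depends on the data and on the applicable regulations.

```python
import hashlib
import hmac
import json


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a truncated keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def prepare_records(records, id_field, drop_fields, secret_key):
    """Drop directly identifying fields and pseudonymize the ID field."""
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in drop_fields}
        row[id_field] = pseudonymize(str(record[id_field]), secret_key)
        cleaned.append(row)
    return cleaned


def build_metadata(name, source, license_name, fields, applications):
    """Collect the descriptive details asked for in Steps 3 and 4."""
    return {
        "name": name,
        "source": source,
        "license": license_name,
        "structure": {"fields": sorted(fields)},
        "potential_applications": list(applications),
    }


# Made-up sample rows, for illustration only.
raw = [
    {"participant_id": "P001", "phone": "000-0001", "language": "Yoruba"},
    {"participant_id": "P002", "phone": "000-0002", "language": "Hausa"},
]

# The key must stay private and must never be shipped with the dataset.
cleaned = prepare_records(raw, "participant_id", {"phone"}, b"keep-this-secret")

metadata = build_metadata(
    name="example-speech-survey",          # hypothetical dataset name
    source="community survey (example)",   # hypothetical source
    license_name="CC BY 4.0",              # example choice, not a requirement
    fields=cleaned[0].keys(),
    applications=["language identification"],
)
print(json.dumps(metadata, indent=2))
```

A keyed hash is used rather than a plain hash so that identifiers cannot be recovered by hashing guessed values without the key. The same input always maps to the same pseudonym, so records belonging to one participant stay linked.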

Key Dates for Submission

Mark your calendars with these essential deadlines to ensure your submission is considered for review.

February 20, 2026

Applications Open

March 20, 2026

Applications Close

May 4, 2026

Notification of Selection

June 1, 2026

Acceptance of Invitation

Benefits and Collaboration Opportunities

As a contributor, you will:

  • Receive formal acknowledgment on the Deep Learning Indaba platform and in downstream research outputs
  • Gain visibility within a pan-African and global AI community
  • Access a growing repository of African datasets to support your own research and innovation
  • Be considered for dataset spotlight sessions, posters, or workshops at Deep Learning Indaba 2026
  • Join a network of collaborators shaping the future of ethical, inclusive AI in Africa

Dataset Tracks and Awards

To recognize excellence and encourage high-impact contributions, selected datasets may be featured under dedicated Dataset Tracks and Awards, including (but not limited to):

  • Best Community-Centered Dataset – for datasets developed through participatory or community-led approaches
  • Best Low-Resource Language Dataset – advancing representation of underrepresented African languages and dialects
  • Best Dataset for Social Impact – addressing critical challenges in health, education, climate, or economic development
  • Best Student or Early-Career Dataset – highlighting outstanding contributions from emerging researchers

Awarded datasets may receive special recognition during the conference, increased visibility on the Indaba platform, and prioritization for follow-on collaborations.

Join Us in Nigeria

This Call for African Datasets is part of the broader conversation at Deep Learning Indaba 2026 in Nigeria. The conference will convene researchers, practitioners, policymakers, and industry leaders for deep technical exchange, community building, and collaboration.

Whether you are a researcher, policymaker, industry leader, or community advocate, your contribution matters. Together, we can build AI systems that reflect, respect, and serve the African continent on its own terms.

For questions or more information, contact riad@deeplearningindaba.com or samueloladejo@deeplearningindaba.com.

Let’s make African data the foundation for innovation that transforms lives and builds a brighter future for all!

Post-Indaba Sustainability and Stewardship

We recognize that meaningful impact requires datasets to remain useful beyond a single conference cycle. To that end, we aim to explore post-Indaba sustainability through:

  • Long-term hosting or trusted repository partnerships
  • Clear licensing and governance frameworks to support responsible reuse
  • Ongoing community engagement around dataset updates, maintenance, and documentation

Where feasible, we will facilitate continued collaboration between dataset contributors, users, and institutional partners to ensure that these datasets remain living resources for African AI research and innovation.