+---
+tags:
+- text-classification
+- sustainable-development-goals
+- SDG
+- transformers
+- bert
+- social-impact
+license: mit
+language:
+- en
+base_model:
+- google-bert/bert-base-uncased
+---
+# SDG Startup Classifier (18-label BERT-based Model)
+[![Model](https://img.shields.io/badge/model-BERT--base--uncased-blue)](https://huggingface.co/bert-base-uncased)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Hugging Face](https://img.shields.io/badge/HuggingFace-BERT%20SDG%20Classifier-green)](https://huggingface.co/amannor/bert-base-uncased-sdg-classifier)
+---
+## Model Overview
+This model is a **BERT-base-uncased** transformer fine-tuned for multiclass classification of startup companies into **18 categories**: the 17 United Nations Sustainable Development Goals (SDGs) plus a "no-impact" label.
+It is based on the methodology and dataset described in the IJCAI 2022 paper by Kfir Bar:
+> *Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals*
+> Kfir Bar (2022) — [Paper PDF](https://github.com/Amannor/sdg-codebase/blob/master/articles/IJCAI_2022_SDGs_Methodology.pdf)
+Given a company description, mission statement, or product summary, the model predicts the single most relevant label, reflecting the company's social or environmental impact focus.
+---
+## Intended Use
+- Automatic SDG classification of startup textual descriptions, mission statements, and product/service information.
+- Support for impact investors, researchers, policymakers, and analysts interested in assessing startup alignment with SDGs.
+- Multiclass classification into all 17 SDGs plus a no-impact class, useful for comprehensive sustainability profiling.
+---
+## Model Details
+- **Architecture:** BERT-base-uncased (`bert-base-uncased` from Hugging Face Transformers)
+- **Number of labels:** 18 (17 SDGs + 1 no-impact)
+- **Tokenizer:** BERT-base-uncased WordPiece tokenizer
+- **Training data:** Proprietary dataset of startup descriptions labeled by SDG, as described in Bar (2022)
+- **Training details:** Fine-tuned with the AdamW optimizer at a learning rate of approximately 2e-5 for multiple epochs
+- **Performance:** Approximately 77% accuracy on the 5 aggregated SDG groups, with competitive performance on the full 18-label task (per original paper)
+---
+## How to Use
+Minimal example code to load and run inference using the Hugging Face Transformers library:
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+
+model_name = "amannor/bert-base-uncased-sdg-classifier"
+
+# Load tokenizer and model from the Hugging Face Hub
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+
+# Input startup description text
+text = "This startup develops affordable solar panels to improve clean energy access."
+
+# Tokenize input text (inputs beyond the model's maximum length are truncated)
+inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
+
+# Forward pass without gradient tracking, since this is inference only
+with torch.no_grad():
+    outputs = model(**inputs)
+
+# Predicted class index (0 to 17, aligned with SDGs + no-impact)
+predicted_label_id = torch.argmax(outputs.logits, dim=-1).item()
+print(f"Predicted SDG label ID: {predicted_label_id}")
+```
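The class index printed above is easier to interpret next to a label name and a probability. The sketch below shows one way to do that and is not part of the original model card: `softmax` and `top_k_labels` are hypothetical helper names, and the `id2label` dictionary is an illustrative stand-in. The actual index-to-SDG mapping is not documented here, so check `model.config.id2label` on the loaded model before relying on any ordering.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Illustrative stand-in: the real mapping should come from model.config.id2label.
id2label = {i: f"LABEL_{i}" for i in range(18)}

# Hypothetical logits for one description; with the model loaded they would be
# outputs.logits[0].tolist().
example_logits = [0.0] * 18
example_logits[7] = 4.0
example_logits[13] = 2.0

for label, prob in top_k_labels(example_logits, id2label):
    print(f"{label}: {prob:.3f}")
```

Reporting the top few labels rather than only the argmax can be useful when a startup plausibly touches more than one goal, although the model itself is trained to pick a single class.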
+---
+## Limitations
+- The model relies solely on **textual company descriptions**, which might be promotional or biased (“greenwashing”).
+- Performance may degrade on short, noisy, or non-English inputs.
+- The training dataset was geographically and linguistically limited; generalization outside these domains may be suboptimal.
+- Intended to assist, not replace, expert judgment.
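Because the model is meant to support rather than replace expert judgment, one option is to route low-confidence predictions to a human reviewer. The following is a minimal sketch under stated assumptions: `triage_prediction` is a hypothetical helper, and the 0.6 threshold is an arbitrary placeholder that would need tuning on held-out data. Neither comes from the original paper.

```python
def triage_prediction(probabilities, threshold=0.6):
    """Return (label_id, confidence, needs_review) for one prediction.

    `probabilities` is a list of class probabilities, e.g. softmax output.
    `threshold` is an assumed cut-off, not a value from the paper.
    """
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    label_id = max(range(len(probabilities)), key=lambda i: probabilities[i])
    confidence = probabilities[label_id]
    needs_review = confidence < threshold
    return label_id, confidence, needs_review


# Hypothetical probabilities over three classes, kept short for readability.
confident = triage_prediction([0.05, 0.90, 0.05])
uncertain = triage_prediction([0.40, 0.35, 0.25])
print(confident)
print(uncertain)
```

A gate like this does not detect promotional or misleading descriptions; it only flags cases where the model itself is unsure.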
+---
+## Citation
+If you use this model, please cite:
+```bibtex
+@inproceedings{bar2022ijcai,
+  title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals},
+  author={Bar, Kfir},
+  booktitle={Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI)},
+  year={2022}
+}
+```
+You may also wish to reference the accompanying repository:
+https://github.com/Amannor/sdg-codebase
+---
+## License
+This model is released under the **MIT License**. For more information, see the LICENSE file in this repository.
+---
+## Links and Resources
+- [Full repository with code, notebooks, and datasets](https://github.com/Amannor/sdg-codebase)
+- [IJCAI 2022 original paper PDF](https://github.com/Amannor/sdg-codebase/blob/master/articles/IJCAI_2022_SDGs_Methodology.pdf)
+---
+*For questions or issues, please open an issue in the GitHub repository or contact the maintainer via Hugging Face.*