LSTM Text Generation Model

This model was trained using TensorFlow/Keras for financial article generation.

Model Details

  • Model Type: LSTM
  • Framework: TensorFlow/Keras
  • Task: Text Generation
  • Vocabulary Size: 41,376
  • Architecture: Long Short-Term Memory (LSTM)
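
Once the model is loaded as shown in Usage below, these details can be confirmed directly with a standard Keras call:

model.summary()  # prints the LSTM layer stack and parameter counts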

Usage

from huggingface_hub import snapshot_download
import tensorflow as tf
import json
import pickle
import numpy as np

# Download model files
model_path = snapshot_download(repo_id="firobeid/L4_LSTM_financial_article_generator")

# Load the LSTM model
model = tf.keras.models.load_model(f"{model_path}/lstm_model")

# Load tokenizer
try:
    # Try JSON format first
    with open(f"{model_path}/tokenizer.json", 'r', encoding='utf-8') as f:
        tokenizer_json = f.read() 
    tokenizer = tf.keras.preprocessing.text.tokenizer_from_json(tokenizer_json)
except FileNotFoundError:
    # Fallback to pickle format
    with open(f"{model_path}/tokenizer.pkl", 'rb') as f:
        tokenizer = pickle.load(f)
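
# Optional sanity check: the tokenizer's word index should roughly match the
# advertised vocabulary size of 41376 (the exact count may differ by one,
# depending on OOV-token handling; an assumption, not documented behavior)
print(f"Tokenizer vocabulary size: {len(tokenizer.word_index)}")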

# Text generation function (greedy decoding: always picks the most likely next word)
def generate_text(input_text, num_words=10):
    # Keras Tokenizer indices are 1-based; the model appears to have been
    # trained on 0-based indices, hence the -1 shift here and the +1 below
    X = np.array(tokenizer.texts_to_sequences([input_text])) - 1

    output_text = []
    for _ in range(num_words):
        # The model returns a probability distribution over the vocabulary
        # for every position in the input sequence
        y_proba = model.predict(X, verbose=0)[0]
        # Take the argmax at each position, keep the prediction for the
        # last position, and shift back to the tokenizer's 1-based indices
        pred_word_ind = np.argmax(y_proba, axis=-1) + 1
        pred_word = tokenizer.index_word[int(pred_word_ind[-1])]

        # Append the predicted word and re-tokenize the extended prompt
        input_text += ' ' + pred_word
        output_text.append(pred_word)
        X = np.array(tokenizer.texts_to_sequences([input_text])) - 1

    return ' '.join(output_text)
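
# Optional variant (not from the original card): sample the next word from the
# predicted distribution instead of always taking the argmax. Greedy decoding
# tends to repeat itself; temperature sampling trades coherence for variety.
def generate_text_sampled(input_text, num_words=10, temperature=1.0):
    output_text = []
    for _ in range(num_words):
        X = np.array(tokenizer.texts_to_sequences([input_text])) - 1
        y_proba = model.predict(X, verbose=0)[0][-1]  # distribution for the last position
        logits = np.log(y_proba + 1e-9) / temperature
        logits -= logits.max()  # numerical stability before exponentiating
        probs = np.exp(logits)
        probs /= probs.sum()
        next_ind = int(np.random.choice(len(probs), p=probs))
        pred_word = tokenizer.index_word[next_ind + 1]  # back to 1-based indices
        input_text += ' ' + pred_word
        output_text.append(pred_word)
    return ' '.join(output_text)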

# Example usage
# Start with these tags: <business>, <entertainment>, <politics>, <sport>, <tech>
result = generate_text("<tech> The future of artificial intelligence", num_words=15)
print(result)

Training

This model was trained for next-word prediction using an LSTM architecture: given a running sequence of tokens, it learns to predict the word that follows. Prompts are expected to begin with one of the category tags listed under Usage.
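
The exact training pipeline is not published with this card. As a rough, hypothetical sketch of a common Keras setup for next-word prediction (the helper name and window length are illustrative; it reuses the tokenizer and np from the Usage snippet):

def make_training_pairs(texts, tokenizer, seq_len=50):
    # Pair each token window with the same window shifted one step left,
    # so the network predicts the next word at every position (matching
    # the per-timestep outputs consumed in generate_text above)
    X, y = [], []
    for seq in tokenizer.texts_to_sequences(texts):
        seq = [i - 1 for i in seq]  # 0-based indices, matching the -1 shift in Usage
        for start in range(len(seq) - seq_len):
            window = seq[start:start + seq_len + 1]
            X.append(window[:-1])
            y.append(window[1:])
    return np.array(X), np.array(y)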

Limitations

  • Model performance depends on training data quality and size
  • Generated text may not always be coherent for longer sequences
  • The model can only produce words from the vocabulary it was trained on; out-of-vocabulary words in a prompt are dropped or remapped by the tokenizer (see the check below)
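
A quick way to see how much of a prompt survives tokenization is to round-trip it through the tokenizer (a small illustrative check, not part of the original card; "frobnication" stands in for any unseen word):

prompt = "<tech> quantum frobnication forecast"
ids = tokenizer.texts_to_sequences([prompt])[0]
print([tokenizer.index_word[i] for i in ids])
# Words missing from the vocabulary are silently dropped by default, or
# mapped to an OOV token if the tokenizer was built with one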