Model - Getting Started

The model component is a Python microservice that serves a fine-tuned BertForSequenceClassification model for 7-class Turkish news categorization. It exposes both a FastAPI HTTP endpoint (/predict) and a gRPC endpoint (ModelService.Predict) simultaneously from a single process.

The model is published on HuggingFace Hub: mehmetraufoguz/turkish-news-bert-base

Prerequisites

A working Python 3 installation with pip. All Python dependencies are listed in model/requirements.txt and installed in the next step.

Installation

cd model
pip install -r requirements.txt

If you intend to run the combined or gRPC server, you need the generated protobuf stubs:

python -m grpc_tools.protoc \
  -I proto \
  --python_out=. \
  --grpc_python_out=. \
  proto/model.proto

The generated model_pb2.py and model_pb2_grpc.py files are already committed in the repository, so this step is only needed if you modify proto/model.proto.

Environment Variables

Variable         Default        Description
MODEL_SAVE_DIR   ./saved_model  Path to the saved model directory
API_PORT         8000           FastAPI HTTP port
GRPC_PORT        50051          gRPC server port
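The service presumably resolves these with simple environment lookups at startup; a minimal sketch that mirrors the defaults in the table above (the actual variable handling lives in the service's own code):

```python
import os

# Defaults mirror the table above; override by exporting the variable
# before starting the service.
MODEL_SAVE_DIR = os.environ.get("MODEL_SAVE_DIR", "./saved_model")
API_PORT = int(os.environ.get("API_PORT", "8000"))
GRPC_PORT = int(os.environ.get("GRPC_PORT", "50051"))
```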

Running the Inference Service

Combined server (FastAPI + gRPC, recommended)

Runs both servers concurrently in a single process:

MODEL_SAVE_DIR=./saved_model python server.py
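server.py itself is not reproduced here, but running FastAPI and gRPC concurrently from one process is typically done by starting the gRPC server on a background thread and letting uvicorn own the main thread. A hypothetical sketch of that pattern (names other than app, model_pb2_grpc, and ModelService are illustrative; imports are deferred so the sketch parses without grpcio or uvicorn installed):

```python
import threading

def serve_grpc(port: int = 50051) -> None:
    """Start the gRPC server and block until termination (sketch)."""
    from concurrent import futures
    import grpc
    import model_pb2_grpc

    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    # model_pb2_grpc.add_ModelServiceServicer_to_server(servicer, server)
    server.add_insecure_port(f"[::]:{port}")
    server.start()
    server.wait_for_termination()

def serve_http(port: int = 8000) -> None:
    """Run the FastAPI app under uvicorn (sketch)."""
    import uvicorn
    from app import app

    uvicorn.run(app, host="0.0.0.0", port=port)

def run_both() -> None:
    # gRPC on a daemon thread; uvicorn blocks the main thread.
    threading.Thread(target=serve_grpc, daemon=True).start()
    serve_http()
```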

FastAPI only

MODEL_SAVE_DIR=./saved_model uvicorn app:app --host 0.0.0.0 --port 8000

gRPC only

MODEL_SAVE_DIR=./saved_model python grpc_server.py

Docker

A Dockerfile is provided in the model/ directory. Build and run with:

docker build -t aa-news-model ./model
docker run -p 8000:8000 -p 50051:50051 \
  -e MODEL_SAVE_DIR=/app/saved_model \
  aa-news-model

Or use the root docker-compose.yml to start the full stack.
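The root docker-compose.yml is not shown here; a service entry for the model component would plausibly look like the following (service name and paths are illustrative, mirroring the docker run command above):

```yaml
services:
  model:
    build: ./model
    ports:
      - "8000:8000"
      - "50051:50051"
    environment:
      MODEL_SAVE_DIR: /app/saved_model
```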

API Reference

GET /health

Returns the current service status.

Response
{
  "status": "ok",
  "model": "BertForSequenceClassification",
  "num_labels": 7,
  "device": "cpu"
}
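From Python, the endpoint can be probed with the standard library alone (the URL assumes the default API_PORT; this sketch is a client-side helper, not part of the service):

```python
import json
from urllib import request

def check_health(base_url: str = "http://localhost:8000") -> dict:
    """GET /health against a running instance and decode the JSON body."""
    with request.urlopen(f"{base_url}/health") as resp:
        return json.loads(resp.read())

# The documented response decodes to a plain dict:
sample = json.loads(
    '{"status": "ok", "model": "BertForSequenceClassification", '
    '"num_labels": 7, "device": "cpu"}'
)
```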

POST /predict

Classifies a news article by headline and summary.

Request body
{
  "id": "article-123",
  "baslik": "Merkez Bankası faiz kararını açıkladı",
  "ozet": "Para politikası kurulu toplantısında faiz oranı sabit tutuldu."
}

Field   Type    Constraints
id      string  Any string
baslik  string  min_length=1
ozet    string  min_length=1
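The constraints can be mirrored client-side before sending a request; a small stdlib sketch (the service itself presumably enforces them via its own request model):

```python
def build_predict_payload(article_id: str, baslik: str, ozet: str) -> dict:
    """Build a /predict request body, mirroring the documented constraints."""
    if not baslik:
        raise ValueError("baslik requires min_length=1")
    if not ozet:
        raise ValueError("ozet requires min_length=1")
    # The JSON field is named "id"; article_id avoids shadowing the builtin.
    return {"id": article_id, "baslik": baslik, "ozet": ozet}
```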
Response body
{
  "predicted_category": 1,
  "confidence": 0.9621,
  "all_confidences": {
    "POLITIKA": 0.0183,
    "EKONOMI": 0.9621,
    "SPOR": 0.0041,
    "SAGLIK": 0.0058,
    "KULTUR_SANAT": 0.0031,
    "DUNYA": 0.0042,
    "TEKNOLOJI": 0.0024
  }
}

Field               Description
predicted_category  Integer ID (0–6) of the predicted class
confidence          Softmax probability of the predicted class
all_confidences     Softmax probabilities for all 7 classes, keyed by label
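The integer IDs appear to follow the label order shown in all_confidences above (the sample response has predicted_category 1 with EKONOMI as the top class). If so, the mapping can be expressed as follows; treat this ordering as an inference from the sample, not a documented contract, and verify it against the model's own config:

```python
# Assumed ID-to-label order, inferred from the sample /predict response.
LABELS = ["POLITIKA", "EKONOMI", "SPOR", "SAGLIK",
          "KULTUR_SANAT", "DUNYA", "TEKNOLOJI"]

def label_for(category_id: int) -> str:
    return LABELS[category_id]
```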
Error responses
  • 503 Service Unavailable — model not loaded yet
  • 422 Unprocessable Entity — input is empty after preprocessing
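Putting it together, a stdlib client call against a running instance might look like this (URL assumes the default ports; the example call is shown but not executed here):

```python
import json
from urllib import request

def predict(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST /predict against a running instance and decode the response."""
    req = request.Request(
        f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example call (requires the service to be running):
# predict({"id": "article-123",
#          "baslik": "Merkez Bankası faiz kararını açıkladı",
#          "ozet": "Para politikası kurulu toplantısında faiz oranı sabit tutuldu."})
```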

gRPC

The gRPC service is defined in proto/model.proto:

service ModelService {
  rpc Predict (PredictRequest) returns (PredictResponse);
}
 
message PredictRequest {
  string id     = 1;
  string baslik = 2;
  string ozet   = 3;
}
 
message PredictResponse {
  int32              predicted_category = 1;
  float              confidence         = 2;
  map<string, float> all_confidences    = 3;
}

Default port: 50051.
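With the generated stubs on the path, a Python client is a few lines. A sketch assuming a server on the default port (the stub class name ModelServiceStub follows grpcio's standard code generation for service ModelService; imports are deferred so the sketch parses without grpcio or the stubs installed):

```python
def grpc_predict(article_id: str, baslik: str, ozet: str,
                 target: str = "localhost:50051"):
    """Call ModelService.Predict over an insecure channel (sketch)."""
    import grpc
    import model_pb2
    import model_pb2_grpc

    with grpc.insecure_channel(target) as channel:
        stub = model_pb2_grpc.ModelServiceStub(channel)
        request = model_pb2.PredictRequest(
            id=article_id, baslik=baslik, ozet=ozet
        )
        return stub.Predict(request)
```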

Training

To fine-tune the model yourself:

python train.py \
  --dataset-name mehmetraufoguz/turkish-news-dataset \
  --hf-token YOUR_HF_TOKEN \
  --num-epochs 3 \
  --batch-size 16 \
  --lr 2e-5 \
  --output-dir ./saved_model \
  --wandb-project turkish-news-encoder \
  --push-to-hub \
  --hub-model-id YOUR_HF_USERNAME/turkish-news-bert-base

Key flags:

Flag                     Description
--no-wandb               Disable W&B logging
--push-to-hub            Push the trained model to HuggingFace Hub
--hub-model-id           HuggingFace repository ID for the push
--gdrive-checkpoint-dir  Copy checkpoints to Google Drive (useful for Colab)
--seed                   Random seed for reproducibility