Date format for fine-tuning AI models

kegintheai · May 20, 2026, 5:38am

Hi. I’m struggling with the date format to use in my datasets to fine-tune QwenX.X and Gemma-X.X models. The models always infer the wrong dates after fine-tuning. I tried several formats, as shown in the examples below, to no avail. I would appreciate any guidance on the right date format to use, or on how to discover it for a given model.

Examples of date formats I tested, none of which worked:

user: When did XYZ work for ABC. Assistant: From January 2021 till December 2021.

user: When did XYZ work for ABC. Assistant: From 01, 2021 till 12, 2021.

user: When did XYZ work for ABC. Assistant: From 01-01-2021 till 12-01- 2021.

user: When did XYZ work for ABC. Assistant: From 01-01-2021 till 12-01- 2021 .

user: When did XYZ work for ABC. Assistant: From January 2021 01-01-2021 till December 2021 12-01- 2021 .

Vultieris · May 20, 2026, 9:21am

The issue is usually consistency and how the training examples are formatted. Avoid mixing natural language, commas, spaces, and different date styles in the same dataset. Also avoid adding fake exact days if your source data only has month/year.

I’d use one clear format everywhere, for example:

From 2021-01 to 2021-12

or, if you need exact dates:

From 2021-01-01 to 2021-12-01
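
As a rough illustration of the "one format everywhere" advice, here is a minimal Python sketch that rewrites the date styles from the original post into `YYYY-MM` or `YYYY-MM-DD`. The function name `normalize_date` and the pattern list are hypothetical, and the sketch assumes that numeric dates such as `12-01-2021` are month-day-year; adjust that if your source data uses day-month-year.

```python
import re
from datetime import datetime

# (input format, output format) pairs tried in order.
# Assumption: numeric dates like "12-01-2021" are MM-DD-YYYY.
_PATTERNS = [
    ("%B %Y", "%Y-%m"),        # "January 2021" -> "2021-01"
    ("%m, %Y", "%Y-%m"),       # "01, 2021"     -> "2021-01"
    ("%m-%d-%Y", "%Y-%m-%d"),  # "01-01-2021"   -> "2021-01-01"
    ("%Y-%m", "%Y-%m"),        # already normalized, month precision
    ("%Y-%m-%d", "%Y-%m-%d"),  # already normalized, day precision
]

def normalize_date(text: str) -> str:
    """Return the date in YYYY-MM or YYYY-MM-DD form, or raise ValueError."""
    # Remove stray spaces around hyphens, e.g. "12-01- 2021" -> "12-01-2021".
    cleaned = re.sub(r"\s*-\s*", "-", text.strip())
    # Collapse any remaining runs of whitespace to a single space.
    cleaned = re.sub(r"\s+", " ", cleaned)
    for in_fmt, out_fmt in _PATTERNS:
        try:
            return datetime.strptime(cleaned, in_fmt).strftime(out_fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")
```

Month-only inputs stay at month precision, so no day is invented when the source data only has month and year. Anything the patterns do not cover raises an error instead of passing through silently, which makes leftover format mixtures easy to spot before training.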

kegintheai · May 20, 2026, 11:18am

Thanks @Vultieris I appreciate your valuable feedback. I’ll test it. Cheers!

kegintheai · May 21, 2026, 3:36pm

Unfortunately it didn’t work out. My fine-tuned model is still making up invalid dates. I’m beginning to think the issue lies elsewhere outside the scope of date formats. I’ll keep digging!

I should add that I have witnessed this issue across the board while fine-tuning several QwenX.X and LlamaX.X models using SFT. I have traced the tokenizing of the dates and displayed the trainer’s batch input-ids right before starting training. They matched the dates found in the dataset.
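
To pin down what "invalid dates" means for a given run, one option is to compare each generated answer with its reference answer and sort the dates into impossible ones, well-formed ones that differ from the reference, and reference dates that are missing. The sketch below is only an illustration: `audit_answer` and its report keys are hypothetical names, and it assumes answers use the `YYYY-MM` or `YYYY-MM-DD` style suggested above.

```python
import re
from datetime import datetime

# Matches YYYY-MM or YYYY-MM-DD tokens (assumed answer format).
DATE_RE = re.compile(r"\b\d{4}-\d{2}(?:-\d{2})?\b")

def extract_dates(text: str) -> list[str]:
    """Return all date-like tokens in the order they appear."""
    return DATE_RE.findall(text)

def is_valid(date_str: str) -> bool:
    """True if the token is a real calendar month or day."""
    fmt = "%Y-%m-%d" if len(date_str) == 10 else "%Y-%m"
    try:
        datetime.strptime(date_str, fmt)
        return True
    except ValueError:
        return False

def audit_answer(generated: str, reference: str) -> dict[str, list[str]]:
    """Classify the dates in a generated answer against the reference answer."""
    expected = set(extract_dates(reference))
    produced = extract_dates(generated)
    report = {"invalid": [], "unexpected": [], "missing": []}
    for d in produced:
        if not is_valid(d):
            report["invalid"].append(d)      # e.g. month 13 or February 30
        elif d not in expected:
            report["unexpected"].append(d)   # well-formed but not in the reference
    report["missing"] = sorted(expected - set(produced))
    return report
```

Running this over a held-out set may show whether the model mostly emits impossible dates or plausible dates that differ from the reference, two symptoms that could point to different causes.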
