Automating law text labeling with LLMs
This project builds an end-to-end pipeline (GitHub repo) for automating the labeling of self-expression laws using a combination of human annotations, large language models, and fine-tuned transformer architectures. I began with human-coded laws containing provision keys (i.e., general statements of legal rules) and deontic indicators (i.e., whether a legal rule was explicitly included or excluded in the law), then used GPT-5 to extract the relevant legal text and reconstruct a structured training dataset. To address class imbalance, I generated targeted synthetic observations, producing a balanced dataset suitable for training.
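The balancing step can be sketched as follows. This is a minimal stand-in, not the project's actual code: it oversamples minority label classes by resampling existing rows, whereas the real pipeline generated new synthetic observations with an LLM. The function name and label representation are hypothetical.

```python
# Sketch of class balancing: top up each minority label class until it
# matches the majority count. In the real pipeline the extra rows were
# LLM-generated synthetic observations; duplication is a placeholder.
import random
from collections import Counter

def balance(rows, label_fn, seed=0):
    """rows: list of examples; label_fn: example -> label (hypothetical API)."""
    rng = random.Random(seed)
    by_label = {}
    for r in rows:
        by_label.setdefault(label_fn(r), []).append(r)
    target = max(len(members) for members in by_label.values())
    balanced = []
    for label, members in by_label.items():
        balanced.extend(members)
        # Stand-in for generating (target - len) synthetic examples.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced
```

In practice the resampled rows would be replaced by a generation call that prompts the LLM for new clause text carrying the underrepresented provision-key and deontic labels.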
I then fine-tuned a LEGAL-BERT model on Azure ML using LoRA adapters and a multi-head classifier that jointly predicts provision type and deontic status. For evaluation, test laws were split into clause-level inputs with GPT-5-mini under strict and relaxed relevance settings, passed through the model, and compared against human labels. Performance was assessed using F1, precision, and recall, both overall and by provision key, yielding a reproducible workflow for scaling legal text classification across hundreds of variables. While still a work in progress, the model outperformed a zero-shot pipeline using GPT-4o.
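The per-provision-key scoring described above can be illustrated with a small sketch. This is not the project's evaluation code; it assumes predictions and gold labels are stored as dicts mapping a clause id to a provision key (with `None` for clauses judged irrelevant), which is a hypothetical representation.

```python
# Sketch of per-provision-key precision / recall / F1, assuming gold and
# predicted labels are dicts of clause_id -> provision_key (or None).
from collections import defaultdict

def per_key_prf(gold, pred):
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for cid, g in gold.items():
        p = pred.get(cid)
        if p == g and g is not None:
            tp[g] += 1  # correct provision key on this clause
        else:
            if p is not None:
                fp[p] += 1  # predicted a key that does not match gold
            if g is not None:
                fn[g] += 1  # missed the gold key for this clause
    scores = {}
    for key in set(tp) | set(fp) | set(fn):
        prec = tp[key] / (tp[key] + fp[key]) if tp[key] + fp[key] else 0.0
        rec = tp[key] / (tp[key] + fn[key]) if tp[key] + fn[key] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[key] = {"precision": prec, "recall": rec, "f1": f1}
    return scores
```

Overall (micro-averaged) scores follow the same pattern with the counts pooled across keys; the strict and relaxed relevance settings would simply change which clauses carry a non-`None` gold label.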

