
L-NLP
Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts (ACL 2024)

[Figure: task description]

[🌐 Website][📜 Paper][🤗 HF Models][🤗 HF Dataset][🐱 GitHub]

This is the official implementation of the paper:

Shubham Kumar Nigam, Anurag Sharma, Danush Khanna, Noel Shallum, Kripabandhu Ghosh, and Arnab Bhattacharya:

Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts (to appear in ACL 2024)

LLMs used for legal outcome prediction and explanation face challenges due to the complexity of legal proceedings and the scarcity of expert-annotated data. PredEx tackles this with the largest expert-annotated dataset for Indian legal judgment prediction, featuring over 15,000 annotations. Our best Transformer model, RoBERTa, achieves 78% accuracy, surpassing Llama-2-7B at 38% and human experts at 73%. PredEx sets a new benchmark for legal judgment prediction in the NLP community!
See also our LinkedIn post.

PredEx can be used to improve already-trained large language models, not only at predicting legal outcomes but also at providing meaningful reasoning behind their decisions. For best results, train or fine-tune the models on PredEx.
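As a starting point, the dataset can be loaded with the 🤗 `datasets` library. A minimal sketch, assuming the dataset is published under the `L-NLProc` organization as `L-NLProc/PredEx`; the exact dataset ID and field names are assumptions, so check the HF Dataset link above:

```python
# Minimal sketch for loading PredEx; the dataset ID below is an
# assumption -- verify it on the Hugging Face dataset page.
from datasets import load_dataset

dataset = load_dataset("L-NLProc/PredEx")  # assumed ID
print(dataset)                             # inspect the available splits
print(dataset["train"][0].keys())          # inspect the actual schema
```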

If you have any questions about this work, please open a GitHub issue or email the authors at shubhamkumarnigam@gmail.com, anuragsharma3211@gmail.com, or danush.s.khanna@gmail.com.

May 2024 - PredEx will appear at ACL 2024!

Getting Started

General Instructions

Ensure you have the necessary hardware and software requirements in place to replicate our experimental setup. Follow the steps below to configure your environment for optimal performance.

Recommended Hardware Configuration

Hardware Specifications

  • Instruction fine-tuning was run on two NVIDIA A100-PCIE-40GB GPUs on a machine with 32 CPU cores and 126 GB RAM.
  • Additionally, a Google Colab Pro subscription with the A100 hardware accelerator is recommended for inference and other experiments.

Recommended Software Configuration

Software Setup

  • Set up the environment with appropriate drivers and libraries for GPU acceleration.
  • Install the necessary dependencies for model training and inference; a quick environment check is sketched below.
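A minimal environment sketch. The package list is an assumption based on the experiments described here (Transformer fine-tuning and inference on CUDA GPUs), so adjust it to the notebooks in this repository:

```python
# Environment sanity check; the package set is an assumption, e.g.:
#   pip install torch transformers datasets peft trl bitsandbytes accelerate
import torch

assert torch.cuda.is_available(), "a CUDA GPU with working drivers is required"
print(torch.cuda.get_device_name(0))  # e.g. NVIDIA A100-PCIE-40GB
print(f"{torch.cuda.get_device_properties(0).total_memory / 1e9:.0f} GB VRAM")
```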

Model Training Specifics

Fine-tuning Parameters

  • Fine-tune the Large Language Models (LLMs) for 5 epochs, balancing training adequacy against overfitting (a minimal sketch follows).
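A minimal fine-tuning sketch using `trl`'s `SFTTrainer` with LoRA adapters. Only the epoch count comes from this README; the dataset ID, text field, LoRA and optimizer hyperparameters are illustrative assumptions, and the `SFTTrainer` API varies across `trl` versions:

```python
# Sketch of instruction fine-tuning for 5 epochs with LoRA adapters.
# Everything except num_train_epochs=5 is an assumption.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

dataset = load_dataset("L-NLProc/PredEx", split="train")  # assumed dataset ID

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",  # base model used in the paper
    train_dataset=dataset,
    dataset_text_field="text",         # assumed field name
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
    args=TrainingArguments(
        output_dir="predex-llama2-ft",
        num_train_epochs=5,            # per the instructions above
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,            # assumed
        fp16=True,
    ),
)
trainer.train()
```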

Post-processing for Quality Enhancement

  • Apply a post-processing step after inference to mitigate common issues with generative models, such as hallucinated and repeated sentences.
  • Keep only the first occurrence of the decision and explanation parts of the model output and drop subsequent repetitions, ensuring coherent and concise outputs (see the sketch below).
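A sketch of this post-processing step. The `Decision:`/`Explanation:` markers are assumptions about the generation format; adapt them to your prompt template:

```python
# Keep only the first decision/explanation segment of a generation and
# drop repeated trailing text. Marker strings are assumptions.
import re

def postprocess(generated: str) -> str:
    # Everything from the first "Decision:" up to a repeated "Decision:" block.
    parts = re.split(r"(?i)\bdecision\s*:", generated)
    if len(parts) < 2:
        return generated.strip()  # no marker found; leave as-is
    first = "Decision:" + parts[1]
    # Within that, keep only the first explanation segment as well.
    exp_parts = re.split(r"(?i)\bexplanation\s*:", first)
    if len(exp_parts) > 2:
        first = exp_parts[0] + "Explanation:" + exp_parts[1]
    return first.strip()
```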

Evaluation Process

Handling Non-inferential Results

  • Exclude cases where an LLM yields no inference result, to maintain the integrity and accuracy of the experimental findings.
  • Excluding these non-inferential cases keeps the evaluation unbiased and reflective of the models' performance on the cases they actually answered (see the sketch below).
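A sketch of this evaluation filter. The data layout (a list of dicts with `prediction`/`gold` keys, `None` marking a non-inferential case) is an assumption:

```python
# Drop cases with no usable inference before scoring, so the metric
# reflects only cases the model actually answered.
def filter_and_score(results):
    answered = [r for r in results if r["prediction"] is not None]
    skipped = len(results) - len(answered)
    accuracy = sum(r["prediction"] == r["gold"] for r in answered) / len(answered)
    return accuracy, skipped

acc, skipped = filter_and_score([
    {"prediction": 1, "gold": 1},
    {"prediction": None, "gold": 0},  # non-inferential: excluded
    {"prediction": 0, "gold": 1},
])
print(f"accuracy={acc:.2f} over answered cases; {skipped} excluded")
```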

Trained Models

The following models from the paper are available on Hugging Face.

Table 1: Prediction-only, LM-based models on PredEx

| Dataset | Method | Hugging Face link |
|---------|--------|-------------------|
| PredEx | InLegalBERT | L-NLProc/PredEx_InLegalBert_Pred |
| PredEx | InCaseLaw | L-NLProc/PredEx_InCaseLaw_Pred |
| PredEx | XLNet Large | L-NLProc/PredEx_XLNet_Large_Pred |
| PredEx | RoBERTa Large | L-NLProc/PredEx_RoBERTa_Large_Pred |

Table 2: Prediction-only, LLM-based models on PredEx

| Dataset | Method | Hugging Face link |
|---------|--------|-------------------|
| PredEx | Zephyr | Zephyr Hugging Face API |
| PredEx | Gemini Pro | Build with Gemini API |
| PredEx | Llama-2-7B | L-NLProc/PredEx_Llama-2-7B_Pred |
| PredEx | Llama-2-7B, instruction-tuned on the prediction task | L-NLProc/PredEx_Llama-2-7B_Pred_Instruction-Tuned |

Table 3: Prediction with explanation on PredEx, LLM-based models

| Dataset | Method | Hugging Face link |
|---------|--------|-------------------|
| PredEx | Gemini Pro | Build with Gemini API |
| PredEx | Llama-2-7B | L-NLProc/PredEx_Llama-2-7B_Pred-Exp |
| PredEx | Llama-2-7B, instruction-tuned on the prediction-with-explanation task | L-NLProc/PredEx_Llama-2-7B_Pred-Exp_Instruction-Tuned |

Table 4: Prediction with explanation on ILDC Expert, LLM-based models

| Dataset | Method | Hugging Face link |
|---------|--------|-------------------|
| ILDC Expert | Llama-2-7B | L-NLProc/ILDC_Llama-2-7B_Pred-Exp |
| ILDC Expert | Llama-2-7B, instruction-tuned on the prediction-with-explanation task | L-NLProc/ILDC_Llama-2-7B_Pred-Exp_Instruction-Tuned |
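A minimal sketch for loading one of the released checkpoints. For the encoder-based prediction models in Table 1, a sequence-classification head applies; the head and label mapping are assumptions, so check the individual model cards:

```python
# Load a released prediction checkpoint; the classification head and
# label setup are assumptions -- see the model card for details.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "L-NLProc/PredEx_InLegalBert_Pred"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("...case text...", return_tensors="pt", truncation=True)
pred = model(**inputs).logits.argmax(-1).item()  # predicted outcome class
print(pred)
```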

Results

[Result figures from the paper; see the paper for the full evaluation tables.]

Citation

If you use our method or models, please cite our paper:

@misc{nigam2024legal,
      title={Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts}, 
      author={Shubham Kumar Nigam and Anurag Sharma and Danush Khanna and Noel Shallum and Kripabandhu Ghosh and Arnab Bhattacharya},
      year={2024},
      eprint={2406.04136},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
} 
