Summary

The Politeness Rewriter is a hybrid NLP system that detects, classifies, and rewrites user input into polite, respectful text using a combination of transformer-based classifiers and controlled text generation. It specifically combines:

  • A DistilRoBERTa politeness classifier (trained on the Stanford Politeness Corpus from ConvoKit),
  • A T5-based paraphraser conditioned for polite rewriting,
  • A modular rewrite pipeline with scoring, reranking, and explainability,
  • And an optional Gradio demo app for interactive rewriting.
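
As a rough illustration of how the components listed above could fit together, the sketch below wires a classifier, a paraphraser, and a similarity scorer into one classify, generate, score, and rerank loop. It is not the project's actual code: `rewrite_politely`, `Candidate`, the 0.5 threshold, and the equal weighting are assumptions made for illustration, and the `toy_*` functions are simple stand-ins for the DistilRoBERTa classifier, the T5 paraphraser, and a real semantic similarity model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    text: str
    politeness: float  # classifier probability that the text is polite
    similarity: float  # semantic similarity to the original input
    score: float       # weighted combination used for reranking


def rewrite_politely(
    text: str,
    classify: Callable[[str], float],
    paraphrase: Callable[[str], List[str]],
    similarity: Callable[[str, str], float],
    polite_threshold: float = 0.5,  # assumed cut-off, not taken from the project
    weight: float = 0.5,            # assumed balance between tone and meaning
) -> Candidate:
    """Return the best-scoring rewrite, or the input itself if it is already polite."""
    p_input = classify(text)
    unchanged = Candidate(text, p_input, 1.0, weight * p_input + (1 - weight))
    if p_input >= polite_threshold:
        return unchanged

    candidates = []
    for cand in paraphrase(text):
        p = classify(cand)
        s = similarity(text, cand)
        candidates.append(Candidate(cand, p, s, weight * p + (1 - weight) * s))

    # Fall back to the input when the paraphraser produces nothing.
    return max(candidates, key=lambda c: c.score) if candidates else unchanged


# --- Toy stand-ins (hypothetical) so the sketch runs without model downloads ---

def toy_classify(text: str) -> float:
    """Counts polite markers; a real system would call the trained classifier."""
    markers = ("please", "could you", "thank")
    hits = sum(m in text.lower() for m in markers)
    return min(1.0, 0.2 + 0.4 * hits)


def toy_paraphrase(text: str) -> List[str]:
    """Template-based candidates; a real system would sample from the paraphraser."""
    body = text.rstrip(".!?")
    lowered = body[:1].lower() + body[1:]
    return [f"Could you please {lowered}?", f"{body} now.", "Thank you so much!"]


def toy_similarity(a: str, b: str) -> float:
    """Jaccard token overlap as a crude proxy for semantic similarity."""
    ta, tb = set(re.findall(r"\w+", a.lower())), set(re.findall(r"\w+", b.lower()))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


best = rewrite_politely("Send me the report.", toy_classify, toy_paraphrase, toy_similarity)
print(best)
```

Passing the three models in as callables keeps the rerank logic independent of any particular model, so each component can be swapped or tested in isolation.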

The project demonstrates classifier-guided text style transfer: transforming the tone of a sentence without altering its semantic meaning. It was developed as the course project for 2025-2 Introduction to Natural Language Processing (001). Contributors are credited at the bottom of this page.


Research Purpose

Politeness plays a critical role in communication, especially in professional social networking services, AI chatbots, digital assistants, and automated email generation. This project aims to:

  1. Build a lightweight yet robust politeness classifier.
  2. Integrate it with a T5-style paraphraser to automatically rewrite impolite or neutral sentences into polite versions.
  3. Provide a human-interpretable pipeline where users can trace:
    • Classification probability,
    • Semantic similarity,
    • Rewrite quality.
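
One way such a trace could be represented is sketched below, assuming sentence embeddings are available as plain float vectors. The names `RewriteTrace` and `build_trace` are hypothetical, and the "rewrite quality" value is an assumed stand-in (the harmonic mean of politeness probability and similarity), not the project's actual metric.

```python
import math
from dataclasses import dataclass
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class RewriteTrace:
    original: str
    rewrite: str
    polite_probability: float   # classifier output for the rewrite
    semantic_similarity: float  # cosine similarity of the two embeddings
    rewrite_quality: float      # assumed combined score, see build_trace

    def explain(self) -> str:
        """One-line, human-readable summary of the three traced values."""
        return (
            f"'{self.original}' -> '{self.rewrite}' | "
            f"polite={self.polite_probability:.2f}, "
            f"similarity={self.semantic_similarity:.2f}, "
            f"quality={self.rewrite_quality:.2f}"
        )


def build_trace(
    original: str,
    rewrite: str,
    polite_probability: float,
    emb_original: Sequence[float],
    emb_rewrite: Sequence[float],
) -> RewriteTrace:
    # Clamp negative cosine values so the combined score stays in [0, 1].
    sim = max(0.0, cosine_similarity(emb_original, emb_rewrite))
    denom = polite_probability + sim
    # Harmonic mean: high only when the rewrite is both polite and faithful.
    quality = 2 * polite_probability * sim / denom if denom > 0 else 0.0
    return RewriteTrace(original, rewrite, polite_probability, sim, quality)


trace = build_trace("Send it.", "Could you send it?", 0.9, [1.0, 0.0, 1.0], [1.0, 1.0, 1.0])
print(trace.explain())
```

A harmonic mean is used here because it penalizes rewrites that score well on only one axis, such as a very polite sentence that no longer means the same thing.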

The ultimate goal is to build a more socially intelligent language-generation system, using our own training set and evaluation results.


Future Roadmap

  • Context-Aware Rewriting
  • Multi-Style Transfer
  • Explainability
  • Better Reranker Architecture
  • Adversarial Evaluation
  • Multilingual Extension
  • User Controls in UI

Credits

[Figure: Politeness Rewriter Main Architecture via T5]
[Figure: Politeness Rewriter Sample Output]
[Figure: Politeness Rewriter Performance and Statistical Metrics]

Detailed Specifications
  • Domain: Natural Language Processing, Deep Learning, Artificial Intelligence, Data Science
  • Core: T5, DistilBERT, ConvoKit, Context Awareness
  • Architecture: T5, DistilBERT
  • Focus: Smart Contextualization, Linguistic Analysis, NLP System Framework