---
title: Build a text summarization app
linkTitle: Text summarization
keywords: nlp, natural language processing, text summarization, python, bert extractive summarizer
description: Learn how to build and run a text summarization application using Python, Bert Extractive Summarizer, and Docker.
summary: |
  This guide shows how to containerize text summarization models with Docker.
tags: [ai]
languages: [python]
aliases:
  - /guides/use-case/nlp/text-summarization/
params:
  time: 20 minutes
---

## Overview

In this guide, you'll learn how to build and run a text summarization
application. You'll build the application using Python with the Bert Extractive
Summarizer, and then set up the environment and run the application using
Docker.

The sample text summarization application uses the Bert Extractive Summarizer.
This tool uses the Hugging Face PyTorch Transformers library to run extractive
summarization. It first embeds the sentences, then runs a clustering algorithm
on the embeddings, and finally selects the sentences that are closest to each
cluster's centroid.
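
To make the embed, cluster, and select steps concrete, the following is a minimal sketch that uses only the Python standard library. It isn't the Bert Extractive Summarizer's implementation: word-count vectors stand in for BERT embeddings, the k-means routine is a simplified one written for this illustration, and the function names (`split_sentences`, `embed`, `kmeans`, `summarize`) are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(sentences):
    """Stand-in for BERT embeddings: one word-count vector per sentence."""
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    vocab = sorted({word for words in tokenized for word in words})
    vectors = []
    for words in tokenized:
        counts = Counter(words)
        vectors.append([float(counts[word]) for word in vocab])
    return vectors


def kmeans(vectors, k, iterations=10):
    """Simplified k-means, seeded deterministically with the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: math.dist(v, centroids[c]))
            clusters[nearest].append(v)
        for c, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def summarize(text, num_sentences=2):
    """Return at most num_sentences sentences, each the nearest to a centroid."""
    sentences = split_sentences(text)
    if len(sentences) <= num_sentences:
        return sentences
    vectors = embed(sentences)
    centroids = kmeans(vectors, num_sentences)
    chosen = set()
    for centroid in centroids:
        best = min(range(len(sentences)),
                   key=lambda i: math.dist(vectors[i], centroid))
        chosen.add(best)
    # Preserve the original sentence order in the summary.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(text, num_sentences=2)` returns a list of sentences copied verbatim from `text`, which is what makes the approach extractive rather than abstractive. The sketch can return fewer sentences than requested when two centroids share the same nearest sentence.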

## Prerequisites

- You have installed the latest version of [Docker Desktop](/get-started/get-docker.md). Docker adds new features regularly and some parts of this guide may work only with the latest version of Docker Desktop.
- You have a [Git client](https://git-scm.com/downloads). The examples in this section use a command-line based Git client, but you can use any client.

## Get the sample application

1. Open a terminal, and clone the sample application's repository using the
   following command.

   ```console
   $ git clone https://github.com/harsh4870/Docker-NLP.git
   ```

2. Verify that you cloned the repository.

   You should see the following files in your `Docker-NLP` directory.

   ```text
   01_sentiment_analysis.py
   02_name_entity_recognition.py
   03_text_classification.py
   04_text_summarization.py
   05_language_translation.py
   entrypoint.sh
   requirements.txt
   Dockerfile
   README.md
   ```

## Explore the application code

The source code for the text summarization application is in the `Docker-NLP/04_text_summarization.py` file. Open `04_text_summarization.py` in a text or code editor to explore its contents in the following steps.

1. Import the required libraries.

   ```python
   from summarizer import Summarizer
   ```

   This line imports the `Summarizer` class from the `summarizer` package,
   which the application uses to summarize text. The `summarizer` module
   implements the Bert Extractive Summarizer on top of the Hugging Face PyTorch
   Transformers library, a widely used NLP (Natural Language Processing)
   library that provides access to pre-trained models such as BERT.

   The BERT model (Bidirectional Encoder Representations from Transformers)
   captures context in language by using a mechanism known as "attention" to
   weigh how much each word in a sentence matters to the others. For
   summarization, the summarizer embeds each sentence, clusters the embeddings,
   and selects the sentences closest to the cluster centroids as the key
   sentences that capture the main ideas of the text.

2. Specify the main execution block.

   ```python
   if __name__ == "__main__":
   ```

   This Python idiom ensures that the following code block runs only if this
   script is the main program. It provides flexibility, allowing the script to
   function both as a standalone program and as an imported module.

3. Create an infinite loop for continuous input.

   ```python
      while True:
         input_text = input("Enter the text for summarization (type 'exit' to end): ")

         if input_text.lower() == 'exit':
            print("Exiting...")
            break
   ```

   The `while True` loop prompts you for text until you type `exit`. Because
   the comparison uses `lower()`, entries such as `Exit` and `EXIT` also end the
   loop.
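
The main-guard and input-loop steps above can be tried without installing any model. The following is a minimal sketch, not the sample application's code: `run_loop` and `first_sentence` are hypothetical names, `first_sentence` is a placeholder for a real summarizer, and the `read` and `write` parameters are an assumption added so the loop can run with scripted input.

```python
def run_loop(summarize, read=input, write=print):
    """Prompt for text until the user types 'exit'; summarize everything else."""
    while True:
        input_text = read("Enter the text for summarization (type 'exit' to end): ")

        # Same case-insensitive exit check as in the guide's loop.
        if input_text.lower() == 'exit':
            write("Exiting...")
            break

        write(summarize(input_text))


def first_sentence(text):
    """Hypothetical placeholder summarizer: keep the text up to the first period."""
    return text.split(".")[0].strip() + "."


if __name__ == "__main__":
    # Scripted input stands in for a person typing at the prompt.
    scripted = iter(["Docker runs containers. It is portable.", "EXIT"])
    run_loop(first_sentence, read=lambda prompt: next(scripted))
```

Calling `run_loop(first_sentence)` without the `read` argument falls back to the built-in `input`, which gives the interactive behavior described in the steps above. Because of the `__main__` guard, importing this file from another module defines the functions without starting the loop.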
