# Internal link checking tool

You can use [htmltest](https://github.com/wjdp/htmltest) to check for broken links in
[`/content/en/`](https://git.k8s.io/website/content/en/). This is useful when refactoring
sections of content, moving pages around, or renaming files or page headers.

## How the tool works

`htmltest` scans links in the generated HTML files of the Kubernetes website repository.
You run it through a `make` command, which does the following:

- Builds the site and generates output HTML in the `/public` directory of your
  local `kubernetes/website` repository
- Pulls the `wjdp/htmltest` Docker image
- Mounts your local `kubernetes/website` repository into a container started from that image
- Scans the files generated in the `/public` directory and provides command line
  output when it encounters broken internal links
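
The steps above can be approximated by hand. The following is a minimal dry-run sketch, not the actual `make` target: the `run` and `linkcheck_sketch` helpers are hypothetical, the plain `hugo` build command is an assumption (the real target may pass extra options or build inside a container), and `/test` is assumed to be the working directory of the `wjdp/htmltest` image. By default the sketch only prints the commands it would run.

```shell
# Sketch only: approximates the listed steps; the real Makefile target may differ.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

linkcheck_sketch() {
  repo_root="${1:-$PWD}"
  # 1. Build the site (assumed command); Hugo writes HTML to <repo_root>/public by default.
  run hugo --source "$repo_root"
  # 2. Pull the htmltest image.
  run docker pull wjdp/htmltest
  # 3. Mount the repository at /test (assumed working directory of the image)
  #    so that htmltest can read .htmltest.yml and the generated files.
  run docker run --rm -v "$repo_root:/test" wjdp/htmltest
}
```

Calling `linkcheck_sketch /path/to/website` prints the three commands. Setting `DRY_RUN=0` would execute them, but for real checks prefer the `make` target.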

## What it does and doesn't check

The link checker scans generated HTML files, not raw Markdown.
The htmltest tool depends on a configuration file,
[`.htmltest.yml`](https://git.k8s.io/website/.htmltest.yml),
to determine which content to examine.

The link checker scans the following:

- All content generated from Markdown in the
  [`/content/en/docs`](https://git.k8s.io/website/content/en/docs/) directory, excluding:
  - Generated API references, for example
    https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/
- All internal links, excluding:
  - Empty hashes (`<a href="#">` or `[title](#)`) and empty hrefs (`<a href="">` or `[title]()`)
  - Internal links to images and other media files

The link checker does not scan the following:

- Links included in the top and side nav bars, footer links, or links in a page's `<head>` section,
  such as links to CSS stylesheets, scripts, and meta information
- Top level pages and their children, for example: `/training`, `/community`, `/case-studies/adidas`
- Blog posts
- API Reference documentation, for example
  https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/
- Localizations

## Prerequisites and installation

You must install the following:

* [Docker](https://docs.docker.com/get-docker/)
* [make](https://www.gnu.org/software/make/)
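
To confirm that both tools are available before running the link checker, you could use a small check like the one below. This is an illustrative sketch; `missing_tools` is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Warn on stderr if Docker or make is missing.
missing="$(missing_tools docker make)"
if [ -n "$missing" ]; then
  echo "Install before running the link checker: $missing" >&2
fi
```

The check only confirms that the commands exist on your `PATH`; it does not verify that the Docker daemon is running.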

## Running the link checker

To run the link checker:

1. Navigate to the root directory of your local `kubernetes/website` repository.

2. Run the following command:

   ```shell
   make container-internal-linkcheck
   ```

## Understanding the output

If the link checker finds broken links, the output is similar to the following:

```
tasks/access-kubernetes-api/custom-resources/index.html
  hash does not exist --- tasks/access-kubernetes-api/custom-resources/index.html --> #preserving-unknown-fields
  hash does not exist --- tasks/access-kubernetes-api/custom-resources/index.html --> #preserving-unknown-fields
```

This is one set of broken links. The log adds an output for each page with broken links.

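In this output, the page with broken links is `tasks/access-kubernetes-api/custom-resources/index.html`,
which is generated from the Markdown file `tasks/access-kubernetes-api/custom-resources.md`.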

The tool gives a reason: `hash does not exist`. In most cases, you can ignore the reason itself.

The target URL is `#preserving-unknown-fields`.
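
When the log is long, it can help to reduce it to one line per unique broken link. The sketch below assumes every finding has the `<reason> --- <page> --> <target>` shape shown in the sample output; `summarize_findings` is a hypothetical helper, not part of htmltest.

```shell
# Hypothetical helper: read link checker output on stdin and print
# "page<TAB>target<TAB>reason" once per unique finding.
# Assumes each finding looks like "  <reason> --- <page> --> <target>".
summarize_findings() {
  awk -F ' --- | --> ' 'NF == 3 {
    sub(/^[ \t]+/, "", $1)              # trim the leading indentation
    printf "%s\t%s\t%s\n", $2, $3, $1
  }' | sort -u
}
```

For example, `make container-internal-linkcheck 2>&1 | summarize_findings` would collapse the two identical lines in the sample output into a single entry.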

One way to fix a broken link is to:

1. Navigate to the Markdown file with a broken link.
1. Using a text editor, do a full-text search (usually Ctrl+F or Command+F) for the
   broken link's URL, `#preserving-unknown-fields`.
1. Fix the link. For a broken page hash (or _anchor_) link,
   check whether the topic was renamed or removed.

Run the link checker again to verify that the broken links are fixed.
