Removing Duplicate Lines

Problem

After selecting and/or sorting some data you notice that there are many duplicate lines in your results. You’d like to get rid of the duplicates, so that you can see just the unique values.

Solution

You have two choices available to you. If you’ve just been sorting your output, add the -u option to the sort command:

$ somesequence | sort -u

If you aren’t running sort, run the output through uniq instead, provided that the output is already sorted so that identical lines are adjacent:

$ somesequence > myfile
$ uniq myfile
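A quick way to see both approaches side by side, using a hypothetical data file (the filename and contents are just for illustration):

```shell
# Create some sample data with duplicate lines
printf 'apple\nbanana\napple\ncherry\nbanana\n' > fruit.txt

# Sort and de-duplicate in one step
sort -u fruit.txt

# Or sort first, then let uniq collapse the now-adjacent duplicates
sort fruit.txt | uniq
```

Both commands print each value exactly once: apple, banana, cherry.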

Discussion

Since uniq requires its input to be sorted already, we’re more likely to just add the -u option to sort. The exceptions are when we also want to count how many times each duplicate occurs (uniq -c; see Sorting Numbers) or to see only the duplicated lines (uniq -d), which sort alone cannot do.
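Here is a small sketch of those two options, again on made-up sample data:

```shell
# Sample data with one value repeated three times
printf 'red\nred\nblue\nred\n' > colors.txt

# -c prefixes each unique line with its count of adjacent occurrences
sort colors.txt | uniq -c

# -d prints only the lines that occur more than once
sort colors.txt | uniq -d
```

Note that the sort is still required: without it, the three occurrences of "red" would not all be adjacent, and uniq would miscount them.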

Warning

Don’t accidentally overwrite a valuable file; the uniq command is a bit odd in its parameters. Whereas most Unix/Linux commands accept multiple input files on the command line, uniq does not. The first (non-option) argument is taken to be the one and only input file, and a second argument, if supplied, is taken as the output file. So if you supply two filenames on the command line, the second file will be clobbered without warning.
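To see the hazard concretely, here is a hedged demonstration with throwaway files (the filenames are hypothetical; do not try this on data you care about):

```shell
# Two scratch files: one input, one holding "valuable" data
printf 'a\na\nb\n' > input.txt
printf 'precious data\n' > output.txt

# The second argument is treated as the OUTPUT file, not a second input.
# output.txt is silently overwritten with the de-duplicated input.
uniq input.txt output.txt
cat output.txt    # the original "precious data" is gone
```

Contrast this with sort, where `sort file1 file2` merges both files as input; with uniq, that same habit destroys the second file.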

See Also
