Defining the Base62 algorithm

We saw how the Base62 algorithm works in the previous chapters. Here is a complete implementation of that algorithm. The logic is purely arithmetic, and variants of it are easy to find on the web. Take a look at the following code:

package base62

import "strings"

// base is the alphabet; a character's index in it is its digit value
const base = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
const b = 62

// ToBase62 encodes the given database ID (assumed non-negative) to a base62 string
func ToBase62(num int) string {
    // The remainder selects the last character; the quotient carries the rest
    r := num % b
    res := string(base[r])
    q := num / b // integer division already truncates toward zero

    for q != 0 {
        r = q % b
        q = q / b
        // Prepend, because the digits are produced from least significant first
        res = string(base[r]) + res
    }

    return res
}

// ToBase10 decodes a given base62 string back to the database ID
func ToBase10(str string) int {
    res := 0
    for _, r := range str {
        // Shift the running value one base62 place and add the next digit
        res = (b * res) + strings.Index(base, string(r))
    }
    return res
}

In the preceding program, we defined two functions called ToBase62 and ToBase10. The first one takes an integer and generates a base62 string, and the latter one reverses the effect; that is, it takes a base62 string and gives the original number. In order to illustrate this, let us create a simple program that uses both the functions to show encoding/decoding:

vi $GOPATH/src/github.com/narenaryan/usebase62.go

Add the following content to it:

package main

import (
      "log"
      base62 "github.com/narenaryan/base62"
)

func main() {
  x := 100
  base62String := base62.ToBase62(x)
  log.Println(base62String)
  normalNumber := base62.ToBase10(base62String)
  log.Println(normalNumber)
}

Here, we call the functions from the base62 package and log the results. Run this program from $GOPATH/src/github.com/narenaryan using the following command:

go run usebase62.go

It prints:

2017/08/07 23:00:05 1C
2017/08/07 23:00:05 100

The base62 encoding of 100 is 1C. Dividing 100 by 62 gives a quotient of 1 and a remainder of 38, and these two values are the indexes of the characters 1 and C in our base string:

const base = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
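To make the arithmetic concrete, here is a small sketch that records each division step while encoding. The helper name traceEncode and the constant name alphabet are made up for this illustration and are not part of the base62 package; the sketch also assumes a non-negative input.

```go
package main

import "fmt"

// Same alphabet as the base string above: digits, lowercase, uppercase.
const alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// traceEncode is a hypothetical helper: it encodes num (assumed to be
// non-negative) and returns a description of every division step.
func traceEncode(num int) (string, []string) {
	if num == 0 {
		return "0", []string{"0 -> '0'"}
	}
	res := ""
	var steps []string
	for n := num; n > 0; n /= 62 {
		r := n % 62 // remainder selects the next character, right to left
		steps = append(steps, fmt.Sprintf("%d = 62*%d + %d -> %q", n, n/62, r, alphabet[r]))
		res = string(alphabet[r]) + res
	}
	return res, steps
}

func main() {
	encoded, steps := traceEncode(100)
	for _, s := range steps {
		fmt.Println(s)
	}
	fmt.Println("encoded:", encoded)
}
```

For 100, the remainders are 38 (the character C) and then 1, which read in reverse order give 1C.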

The remainder of the number divided by 62 is used as an index into this base string and gives the last character. The quotient is then divided by 62 again and again to find the preceding characters, until it reaches zero. The beauty of this algorithm is that it creates a unique, shorter string for every given number. We pass a database ID to ToBase62 and get a shorter string out. Whenever a URL shortening request comes to our server, it should perform the following steps:

  1. Store the URL in the database and get the ID of the inserted record.
  2. Encode this ID with ToBase62 and pass the resulting short string to the client as the API response.
  3. Whenever a client loads the shortened URL, it hits our API server.
  4. The API server then converts the short string back to the database ID and fetches the original URL from that record.
  5. Finally, the client can use this URL to redirect to the original site.
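The first four steps can be sketched without a database. In the following illustration, memoryStore is a made-up in-memory stand-in for the database table, and shorten, resolve, encode, and decode are hypothetical names chosen for this sketch; the real project uses PostgreSQL and the base62 package instead.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// encode turns a non-negative ID into a base62 string.
func encode(id int) string {
	if id == 0 {
		return "0"
	}
	res := ""
	for ; id > 0; id /= 62 {
		res = string(alphabet[id%62]) + res
	}
	return res
}

// decode reverses encode and rejects characters outside the alphabet.
func decode(s string) (int, error) {
	id := 0
	for _, c := range s {
		i := strings.IndexRune(alphabet, c)
		if i < 0 {
			return 0, fmt.Errorf("invalid base62 character %q", c)
		}
		id = id*62 + i
	}
	return id, nil
}

// memoryStore is an assumption for this sketch: a slice that hands out
// IDs starting at 1, the way an auto-incrementing column would.
type memoryStore struct{ urls []string }

func (m *memoryStore) insert(url string) int {
	m.urls = append(m.urls, url)
	return len(m.urls)
}

func (m *memoryStore) lookup(id int) (string, error) {
	if id < 1 || id > len(m.urls) {
		return "", errors.New("no such record")
	}
	return m.urls[id-1], nil
}

// shorten covers steps 1 and 2: store the URL, then encode its ID.
func shorten(m *memoryStore, url string) string {
	return encode(m.insert(url))
}

// resolve covers step 4: decode the short string and fetch the URL.
func resolve(m *memoryStore, short string) (string, error) {
	id, err := decode(short)
	if err != nil {
		return "", err
	}
	return m.lookup(id)
}

func main() {
	store := &memoryStore{}
	short := shorten(store, "https://example.com/a/very/long/path")
	original, err := resolve(store, short)
	fmt.Println(short, original, err)
}
```

Keeping the encoding separate from the storage means the same two functions work unchanged once the slice is replaced by a real table.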

We are going to write a Go project here that implements the preceding steps. Let us compose the program. We reuse the files from the preceding examples for base62 encoding/decoding and for the database logic. The directory structure looks like this:

urlshortener
├── main.go
├── models
│   └── models.go
└── utils
    └── encodeutils.go

2 directories, 3 files

Copy this directory to $GOPATH/src/github.com/narenaryan. Once again, a small caution. Replace narenaryan with your username. Copy encodeutils.go and models.go from the preceding examples. Then, start writing the main program:

package main

import (
    "database/sql"
    "encoding/json"
    "io/ioutil"
    "log"
    "net/http"
    "time"

    "github.com/gorilla/mux"
    _ "github.com/lib/pq"
    "github.com/narenaryan/urlshortener/models"
    base62 "github.com/narenaryan/urlshortener/utils"
)

// DBClient stores the database session information. Needs to be initialized once
type DBClient struct {
  db *sql.DB
}

// Model the record struct
type Record struct {
  ID  int    `json:"id"`
  URL string `json:"url"`
}

// GetOriginalURL fetches the original URL for the given encoded(short) string
func (driver *DBClient) GetOriginalURL(w http.ResponseWriter, r *http.Request) {
  var url string
  vars := mux.Vars(r)
  // Get ID from base62 string
  id := base62.ToBase10(vars["encoded_string"])
  err := driver.db.QueryRow("SELECT url FROM web_url WHERE id = $1", id).Scan(&url)
  // Handle response details
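  if err != nil {
    // Scan returns an error (sql.ErrNoRows) when no record matches the ID
    w.WriteHeader(http.StatusInternalServerError)
    w.Write([]byte(err.Error()))
  } else {
    // Headers must be set before WriteHeader; changes made afterwards are ignored
    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusOK)
    responseMap := map[string]interface{}{"url": url}
    response, _ := json.Marshal(responseMap)
    w.Write(response)
  }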
}

// GenerateShortURL adds URL to DB and gives back shortened string
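func (driver *DBClient) GenerateShortURL(w http.ResponseWriter, r *http.Request) {
  var id int
  var record Record
  postBody, err := ioutil.ReadAll(r.Body)
  if err == nil {
    err = json.Unmarshal(postBody, &record)
  }
  if err != nil {
    // Reject bodies that cannot be read or are not valid JSON
    w.WriteHeader(http.StatusBadRequest)
    w.Write([]byte(err.Error()))
    return
  }
  err = driver.db.QueryRow("INSERT INTO web_url(url) VALUES($1) RETURNING id", record.URL).Scan(&id)
  if err != nil {
    w.WriteHeader(http.StatusInternalServerError)
    w.Write([]byte(err.Error()))
  } else {
    // Encode the new ID only after the insert has succeeded
    w.Header().Set("Content-Type", "application/json")
    responseMap := map[string]interface{}{"encoded_string": base62.ToBase62(id)}
    response, _ := json.Marshal(responseMap)
    w.Write(response)
  }
}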

func main() {
  db, err := models.InitDB()
  if err != nil {
    panic(err)
  }
  dbclient := &DBClient{db: db}
  defer db.Close()
  // Create a new router
  r := mux.NewRouter()
  // Attach an elegant path with handler
  r.HandleFunc("/v1/short/{encoded_string:[a-zA-Z0-9]*}", dbclient.GetOriginalURL).Methods("GET")
  r.HandleFunc("/v1/short", dbclient.GenerateShortURL).Methods("POST")
  srv := &http.Server{
    Handler: r,
    Addr:    "127.0.0.1:8000",
    // Good practice: enforce timeouts for servers you create!
    WriteTimeout: 15 * time.Second,
    ReadTimeout:  15 * time.Second,
  }
  log.Fatal(srv.ListenAndServe())
}

First, we imported the postgres library and other necessary libraries. We imported our database session from the models. Next, we imported our encode/decode base62 algorithms to implement our logic:

// DBClient stores the database session information. Needs to be initialized once
type DBClient struct {
  db *sql.DB
}

// Model the record struct
type Record struct {
  ID  int    `json:"id"`
  URL string `json:"url"`
}

The DBClient is needed in order to pass the database handle between the handler functions. Record is the structure that resembles the record that gets inserted into the database. We defined two functions in our code, GenerateShortURL and GetOriginalURL, for adding the URL to the database and fetching it back from the database, respectively. As we already explained the internal technique of URL shortening, the client that is using this service will get the necessary response back. Let us run the program and see the output before jumping into further details:

go run $GOPATH/src/github.com/narenaryan/urlshortener/main.go

If your $GOPATH/bin is already in the system PATH variable, we can first install the binary and run it like this:

go install github.com/narenaryan/urlshortener

And then run it with just the program name:

urlshortener

It is a best practice to install the binary because it is then available systemwide. But for smaller programs, we can run main.go by visiting the directory of the program.

Now it runs the HTTP server and starts collecting requests for the URL shortening service. Open another console and type this curl command:

curl -X POST \
  http://localhost:8000/v1/short \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
-d '{
"url": "https://www.forbes.com/forbes/welcome/?toURL=https://www.forbes.com/sites/karstenstrauss/2017/04/20/the-highest-paying-jobs-in-tech-in-2017/&refURL=https://www.google.co.in/&referrer=https://www.google.co.in/"
}'

It returns the shortened string:

{
"encoded_string": "1"
}

The encoded string is just "1" because this is the first record in the table, and the ID 1 maps to the character 1 in the base string. As the IDs grow, the encoded strings become longer combinations of digits and letters. Now, if we need to retrieve the original URL, we can perform a GET request:

curl -X GET \
  http://localhost:8000/v1/short/1 \
  -H 'cache-control: no-cache'

It returns the following JSON:

{
"url": "https://www.forbes.com/forbes/welcome/?toURL=https://www.forbes.com/sites/karstenstrauss/2017/04/20/the-highest-paying-jobs-in-tech-in-2017/\u0026refURL=https://www.google.co.in/\u0026referrer=https://www.google.co.in/"
}

So, the service can use this result to redirect the user to the original URL (site). Here, the generated string doesn't depend on the length of the URL because the database ID is the only input to the encoding.

In PostgreSQL, the RETURNING keyword needs to be added to the INSERT SQL command to fetch the last inserted database ID. This is not the case with MySQL or SQLite3. In our handler, the query INSERT INTO web_url(url) VALUES($1) RETURNING id, executed with record.URL as its argument, returns the ID of the newly inserted record. If we drop the RETURNING keyword, the query returns nothing.
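For comparison, with MySQL or SQLite3 drivers the usual route is Exec followed by LastInsertId on the result, which the lib/pq driver does not support. The sketch below assumes the same web_url table; the execer interface, the insertURL function, and the fake types are made up for this illustration so that it runs without a database. A real *sql.DB satisfies execer and could be passed in directly.

```go
package main

import (
	"database/sql"
	"fmt"
)

// execer is the subset of *sql.DB used here; *sql.DB and *sql.Tx satisfy it.
type execer interface {
	Exec(query string, args ...interface{}) (sql.Result, error)
}

// insertURL stores url and reads the new ID from the driver's result.
// The ? placeholder is the MySQL/SQLite3 style, unlike PostgreSQL's $1.
func insertURL(db execer, url string) (int64, error) {
	res, err := db.Exec("INSERT INTO web_url(url) VALUES(?)", url)
	if err != nil {
		return 0, err
	}
	return res.LastInsertId()
}

// fakeResult and fakeDB exist only so this sketch runs without a database.
type fakeResult struct{ id int64 }

func (f fakeResult) LastInsertId() (int64, error) { return f.id, nil }
func (f fakeResult) RowsAffected() (int64, error) { return 1, nil }

type fakeDB struct{ next int64 }

// Exec pretends to insert a row and hands out increasing IDs.
func (f *fakeDB) Exec(query string, args ...interface{}) (sql.Result, error) {
	f.next++
	return fakeResult{id: f.next}, nil
}

func main() {
	db := &fakeDB{}
	id, err := insertURL(db, "https://example.com")
	fmt.Println(id, err)
}
```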