Mohammad Khorasani (1), Mohamed Abdou (2) and Javier Hernández Fernández (1)
(1) Doha, Qatar
(2) Cambridge, United Kingdom
To develop more advanced Streamlit applications, it is vital to establish session-specific data that can be used to deliver an enhanced experience to the user. Specifically, the application will need to preserve the user’s data and entries using what are referred to as session states. These states can be set and accessed on demand, and they persist whenever the user triggers a rerun of the Streamlit application or navigates from one page to another. In addition, we will establish the means to store state across multiple sessions with cookies, which keep data in the user’s browser so that it can be accessed when they restart the associated Streamlit application. Finally, we will learn how to record and visualize rich insights into how users are interacting with our application, to provide analytics to the developer and product owner alike.
7.1 Implementing Session State Natively
Since Streamlit version 0.84, there has been a native way to store and manage session-specific data, including but not limited to variables, widgets, text, images, and objects. Session state values are stored in a dictionary-like format, where every value is assigned to a unique key by which it is indexed. Before this feature, all variables would be reset whenever the user triggered Streamlit to rerun the script by interacting with the application. Similarly, widgets would be reset to their default values when the user navigated from one page to another. With session state, however, users receive a more personalized experience because variables or entries made previously on other pages of the application remain accessible. For instance, users can enter their username and password once and continue to navigate through the application without being prompted to reenter their credentials until they log out. In a nutshell, session states enable us to develop far more complex applications, which will be discussed extensively in subsequent chapters.
Session state data can be set and read as shown in Listing 7-1, with the associated output in Figure 7-1. Please note that the first two key-value entries (KeyInput1 and KeyInput2) are present even though they haven’t been created by the user. Those keys store the state of the user-modified components, which here are the two text input components defined with those keys. This means that the developer can also modify the value of any component, as long as a unique key is set in its definition. Another caveat is that each session state must be initialized before it can be read by index; otherwise, you will be presented with an error. To avoid this, ensure that you always initialize the state with a null or default value.
import streamlit as st
def get_state_value(key):
    return st.session_state.get(key)
def set_state_value(key, value):
    st.session_state[key] = value
st.title("Session State Management")
c1, c2, c3 = st.columns(3)
with c1:
    st.subheader("All")
    st.write(st.session_state)
with c2:
    st.subheader("Set Key")
    key = st.text_input("Key", key="KeyInput1")
    value = st.text_input("Value")
    if st.button("Set"):
        st.session_state[key] = value
        st.success("Success")
with c3:
    st.subheader("Get Key")
    key = st.text_input("Key", key="KeyInput2")
    if st.button("Get"):
        st.write(st.session_state.get(key))
Listing 7-1
session_state.py
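The initialization caveat noted above can be sketched with a small helper. This is a minimal illustration rather than part of Listing 7-1: init_state is a hypothetical name, and a plain dictionary stands in for st.session_state, which supports the same membership test and item assignment used here.

```python
from collections.abc import MutableMapping
from typing import Any


def init_state(state: MutableMapping, key: str, default: Any = None) -> Any:
    """Create `key` with `default` only if it is missing, then return its value."""
    if key not in state:
        state[key] = default
    return state[key]


# A plain dict stands in for st.session_state in this sketch.
state = {}
init_state(state, "username", "")   # first run: the key is created
state["username"] = "alice"         # user input overwrites the default
init_state(state, "username", "")   # rerun: the existing value is kept
print(state["username"])            # prints "alice"
```

In an application, the call would look like init_state(st.session_state, 'username', '') near the top of the script, before the key is first read.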
7.1.1 Building an Application with Session State
To demonstrate the utility of session states, in the following example we will create a trivial multipage application where the user can use states to store the key of the selected page, an uploaded dataframe, and the value of a slider widget. As shown in Listing 7-2, in our main page we first initialize the state of our page selection, and then we use buttons to change the state to the key of the requested page. Subsequently, the associated function of the selected page is invoked directly from the session state to render the page.
In Page One of the application shown in Listing 7-3, we will use session states to store an uploaded dataframe and the value of a slider that is used to filter the number of rows shown in the dataframe. The user can navigate back and forth between the pages and still be able to access a previously uploaded dataframe with the same number of rows set on the slider as shown in Figure 7-2.
import streamlit as st
from page_1 import func_page_1
def main():
    # Initializing session state for page selection
    if 'page_state' not in st.session_state:
        st.session_state['page_state'] = 'Main Page'
    # Writing page selection to session state
    st.sidebar.subheader('Page selection')
    if st.sidebar.button('Main Page'):
        st.session_state['page_state'] = 'Main Page'
    if st.sidebar.button('Page One'):
        st.session_state['page_state'] = 'Page 1'
    pages_main = {
        'Main Page': main_page,
        'Page 1': run_page_1
    }
    # Run selected page
    pages_main[st.session_state['page_state']]()
def main_page():
    st.title('Main Page')
def run_page_1():
    func_page_1()
if __name__ == '__main__':
    main()
Listing 7-2
main_page.py
import streamlit as st
import pandas as pd
def func_page_1():
    st.title('Page One')
    # Initializing session states for dataframe and slider
    if 'df' not in st.session_state:
        st.session_state['df'] = None
    if 'rows' not in st.session_state:
        st.session_state['rows'] = None
    file = st.file_uploader('Upload file')
    # Writing dataframe to session state
    if file is not None:
        df = pd.read_csv(file)
        st.session_state['df'] = df
    if st.session_state['df'] is not None:
        # Creating slider widget with default value from session state
        rows = st.slider('Rows to display', value=st.session_state['rows'], min_value=1, max_value=len(st.session_state['df']))
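One thing to watch when restoring a slider value from session state is that the stored value may no longer fit the widget's range, for example when it is still None or when a shorter file has been uploaded since the value was saved. A minimal sketch of a guard follows, assuming a hypothetical helper named clamp_rows that is not part of the listing, and assuming the dataframe has at least one row.

```python
from typing import Optional


def clamp_rows(stored: Optional[int], n_rows: int, minimum: int = 1) -> int:
    """Return a slider value within [minimum, n_rows] (assumes n_rows >= minimum)."""
    if stored is None:
        return minimum  # no previous choice yet: start at the lower bound
    return max(minimum, min(stored, n_rows))


print(clamp_rows(None, 50))  # 1
print(clamp_rows(20, 50))    # 20
print(clamp_rows(80, 50))    # 50
```

The result could then be passed to the slider as value=clamp_rows(st.session_state['rows'], len(st.session_state['df'])).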
Session IDs are unique identifiers of each new connection to Streamlit’s HTML-serving WebSocket. A new WebSocket connection is established whenever a new browser page is opened, even if another connection already exists, and the server treats each connection independently. These WebSocket connections are used to transfer data from the server to the client’s browser and vice versa.
Those unique identifiers can be used to provide the end user with a personalized experience. To do so, the server needs to map users’ progress and updates to their corresponding identifiers. This mapping can be done with little effort using Streamlit’s own API. In Listing 7-4, we show how to get the current session’s ID or all the active session IDs, with the output shown in Figure 7-3, which shows two web pages both at http://localhost:8501/.
import streamlit as st
from streamlit.scriptrunner.script_run_context import get_script_run_ctx
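As a hedged sketch of reading the current session ID: the import path of get_script_run_ctx has moved between Streamlit releases, so the helper below (the name current_session_id is an assumption) first tries the path used in this chapter and then the one used by later releases. It returns None when Streamlit is unavailable or when the script is not running under streamlit run. These are internal modules, so the paths may change again.

```python
from typing import Optional


def current_session_id() -> Optional[str]:
    """Return the active Streamlit session ID, or None if it is unavailable."""
    try:
        # Import path used in this chapter's listings.
        from streamlit.scriptrunner.script_run_context import get_script_run_ctx
    except ImportError:
        try:
            # Import path used by later Streamlit releases.
            from streamlit.runtime.scriptrunner import get_script_run_ctx
        except ImportError:
            return None  # Streamlit is not installed, or the module moved again
    ctx = get_script_run_ctx()
    # ctx is None when the script is executed outside `streamlit run`.
    return ctx.session_id if ctx is not None else None


print(current_session_id())
```

Returning None instead of raising lets the same script be imported in tests or plain Python without a running Streamlit server.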
Streamlit’s native method to store session state will more than suffice for most if not all applications. However, it may be necessary to store session state on an accessible database to retrieve later on or, in other words, store session state persistently. This can be especially useful for generating user insights and traffic metrics for your application.
For the example shown in Listing 7-5, we will be making use of PostgreSQL to store and retrieve a variable and a dataframe while our user navigates through the pages of the application as shown in Figures 7-4 and 7-5. Specifically, the entered name and uploaded dataset will be written to the database in tables named with the unique session ID, as shown in Figure 7-6, and read/rendered with the previous value each time the user refers back to Page Two, as long as the user is still within the same session. Once the application is refreshed and a new session ID generated, the user will no longer have access to the variables; however, the database administrator can access the historical states in the database should they need to do so. Given that Streamlit reruns the script with every user interaction, without session state both the name and dataframe would be reset each time the script is run; however, with this implementation, we can save the data persistently and access it in the associated database on demand.
This method is scalable and can be extended to as many variables as needed, and even to files (in the form of byte strings) if required, using the four read and write functions shown as follows:
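To illustrate the pattern of reading and writing state in a table named after the session ID, here is a minimal sketch under stated assumptions. It uses SQLite from the Python standard library in place of PostgreSQL so that it runs without a database server, and it collapses the read and write operations into two hypothetical functions (write_state and read_state) by serializing every value, including dataframe records, as JSON. The function names, the key/value table layout, and the example session ID are illustrative only.

```python
import json
import re
import sqlite3
from typing import Any


def _table(conn: sqlite3.Connection, session_id: str) -> str:
    """Create (if needed) and return the table name for a session."""
    name = "_id_" + session_id.replace("-", "_")
    # Identifiers cannot be bound as SQL parameters, so validate them instead.
    if not re.fullmatch(r"\w+", name):
        raise ValueError("unexpected characters in session ID")
    conn.execute(f"CREATE TABLE IF NOT EXISTS {name} (key TEXT PRIMARY KEY, value TEXT)")
    return name


def write_state(conn: sqlite3.Connection, session_id: str, key: str, value: Any) -> None:
    """Insert or overwrite one JSON-serializable value for this session."""
    table = _table(conn, session_id)
    conn.execute(f"INSERT OR REPLACE INTO {table} (key, value) VALUES (?, ?)",
                 (key, json.dumps(value)))
    conn.commit()


def read_state(conn: sqlite3.Connection, session_id: str, key: str, default: Any = None) -> Any:
    """Return the stored value, or `default` if nothing was written for this key."""
    table = _table(conn, session_id)
    row = conn.execute(f"SELECT value FROM {table} WHERE key = ?", (key,)).fetchone()
    return json.loads(row[0]) if row else default


conn = sqlite3.connect(":memory:")
sid = "0a1b2c3d-example"  # hypothetical session ID
write_state(conn, sid, "name", "Ada")
write_state(conn, sid, "rows", [{"x": 1}, {"x": 2}])  # dataframe records as a list of dicts
print(read_state(conn, sid, "name"))
print(read_state(conn, "other-session", "name", default="unknown"))
```

With Pandas available, a dataframe could be converted with df.to_dict('records') before writing and rebuilt with pd.DataFrame(...) after reading; a new session ID simply maps to a new, empty table.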
Having the ability to record user interactions with a web application can in many instances be quite critical. The developer or product owner must have at their disposal rich and accurate data of how many users are visiting their website, at what times, and how exactly they are engaging with it in order to better tailor and refine their product or service. Imagine you have an ecommerce web application that has been curated impeccably, but is failing to convert leads into sales, and you cannot figure out why. Perhaps, there is a bug with your interface or backend that is inhibiting the user’s actions, or maybe your server is overloaded and is failing to cater to the traffic. Regardless, you will need to diagnose where exactly in your pipeline the problem is located to rectify it, and this is where user insights come into play.
While Google Analytics can deliver rich insights at the server level, including but not limited to numbers of visits, user demographics, and time spent on various pages, it cannot readily record interactions at the application level. Consequently, you are required to develop your own means of recording user insights for the application, and this can be done in several ways. The simplest, shown in Listing 7-6, reads the timestamp each time the user engages with a subsection of your code, such as clicking a button or uploading a dataset as shown in Figure 7-7, and records it in a PostgreSQL database as seen in Figure 7-8. Similarly, the number of rows of the uploaded dataset can also be recorded. Each insight is stored in a separate column in a table whose primary key is the unique session ID. Once the application has been restarted, a new row with a different session ID will be created. By inserting the following update function, you can record any value at any step in your program:
Please note that an insight can be overwritten multiple times by setting the mutable argument to True, or the argument can be left as False if you want to record a value only the first time it is generated.
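The behavior of the mutable argument can be sketched as follows. This is an illustration under assumptions, not the chapter's own function: it uses SQLite from the standard library instead of PostgreSQL, the names ensure_row and update_insight are hypothetical, and so are the insight columns step_1, step_2, and no_rows. Only the table name user_insights and the session_id key follow the text.

```python
import sqlite3
from datetime import datetime

ALLOWED_COLUMNS = {"step_1", "step_2", "no_rows"}  # hypothetical insight columns


def ensure_row(conn, session_id):
    """Create the row for this session if it does not exist yet."""
    conn.execute("INSERT OR IGNORE INTO user_insights (session_id) VALUES (?)",
                 (session_id,))


def update_insight(conn, session_id, column, value, mutable=False):
    """Record `value`; when mutable is False, keep the first value ever written."""
    if column not in ALLOWED_COLUMNS:  # column names cannot be bound as parameters
        raise ValueError(f"unknown column: {column}")
    ensure_row(conn, session_id)
    query = f"UPDATE user_insights SET {column} = ? WHERE session_id = ?"
    if not mutable:
        query += f" AND {column} IS NULL"  # only fill the column while it is empty
    conn.execute(query, (value, session_id))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_insights (session_id TEXT PRIMARY KEY, "
             "step_1 TEXT, step_2 TEXT, no_rows INTEGER)")
sid = "_id_example"  # hypothetical session ID
update_insight(conn, sid, "step_1", datetime.now().isoformat())
first = conn.execute("SELECT step_1 FROM user_insights").fetchone()[0]
update_insight(conn, sid, "step_1", "later")             # ignored: already recorded
update_insight(conn, sid, "no_rows", 120, mutable=True)
update_insight(conn, sid, "no_rows", 45, mutable=True)   # overwritten
print(conn.execute("SELECT step_1, no_rows FROM user_insights").fetchone())
```

Binding values as query parameters, rather than formatting them into the SQL string, also avoids quoting problems when a recorded value contains an apostrophe.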
import streamlit as st
from streamlit.scriptrunner.script_run_context import get_script_run_ctx
from datetime import datetime
import pandas as pd
import psycopg2
from sqlalchemy import create_engine
def get_session_id():
    session_id = get_script_run_ctx().session_id
    session_id = session_id.replace('-','_')
    session_id = '_id_' + session_id
    return session_id
def insert_row(session_id,engine):
    if engine.execute("SELECT session_id FROM user_insights WHERE session_id = '%s'"
    % (session_id)).fetchone() is None:
        engine.execute("""INSERT INTO user_insights (session_id) VALUES ('%s')"""
Now that we have established how to read insights from a Streamlit application and record them on a PostgreSQL database, the next step will be to visualize the data on demand. Initially, to extract the data we can run Listing 7-7 to import the insights table into a Pandas dataframe and save it locally on disk as an Excel spreadsheet if you wish to render your own customized charts. Otherwise, we can visualize the data with Listing 7-8, where we import the Excel spreadsheet generated earlier into a Pandas dataframe, convert the timestamps into hourly and daily values, sum the number of rows that are within the same hour or day, and finally visualize them with Plotly charts as shown in Figure 7-9. In addition, we can filter the data by using a st.selectbox to select the column from the insights table to visualize.
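The hourly and daily aggregation described above can be illustrated with a short sketch. The chapter performs this step on a Pandas dataframe; the version below uses only the standard library so that it runs on its own, and the function name count_by and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def count_by(timestamps, fmt):
    """Count timestamps per bucket, where `fmt` is a strftime pattern naming the bucket."""
    buckets = Counter(datetime.fromisoformat(ts).strftime(fmt) for ts in timestamps)
    return dict(sorted(buckets.items()))


# Hypothetical values of one insight column (ISO 8601 strings).
step_1 = ["2022-03-01 09:05:11", "2022-03-01 09:48:02",
          "2022-03-01 14:20:30", "2022-03-02 09:15:00"]

hourly = count_by(step_1, "%Y-%m-%d %H:00")
daily = count_by(step_1, "%Y-%m-%d")
print(hourly)  # {'2022-03-01 09:00': 2, '2022-03-01 14:00': 1, '2022-03-02 09:00': 1}
print(daily)   # {'2022-03-01': 3, '2022-03-02': 1}
```

The keys and values of either dictionary can be used as the x and y series of a bar chart.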
We have discussed how to store and manage data within a session using native and workaround approaches. However, what might be missing is the ability to manage data between sessions. Examples include storing a counter of how many times a button has been clicked or, even more usefully, not prompting the user to log in every time they open a new session. To do so, we need cookies.
Cookies can be used to track a user’s actions across many websites or to store their personal information, such as authentication tokens. Cookies are stored and managed on the user’s end, specifically in their browser. This means the server doesn’t know about their content by default. To inspect the cookies of any web application, we can open the developer tools in the browser, head to the console tab, and type document.cookie; the cookies will then be displayed as shown in Figure 7-10.
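The console returns the cookies as a single string of name=value pairs separated by semicolons. As a small illustration of that format, the sketch below parses such a string with Python's standard http.cookies module; the cookie names and values are made up for the example.

```python
from http.cookies import SimpleCookie

# Hypothetical string in the format returned by `document.cookie`.
raw = "theme=dark; auth_token=abc123; visits=4"

jar = SimpleCookie()
jar.load(raw)  # split the string into individual cookies
cookies = {name: morsel.value for name, morsel in jar.items()}
print(cookies)  # {'theme': 'dark', 'auth_token': 'abc123', 'visits': '4'}
```

Note that every value is a string, so numeric data such as a counter has to be converted explicitly after reading.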
In a typical Streamlit application, other unfamiliar cookies will appear apart from the one in Figure 7-10; those might be for advertisement tracking or other purposes. Depending on the cookie policy they adopt, the developer might need to remove them. In other cases, the developer might want to add cookies to enhance the application experience. Both actions require a way to manage cookies in a web application.
To manipulate cookies from a Streamlit application, we need a third-party library. For this example, we will use Extra-Streamlit-Components, which can be installed with pip install extra-streamlit-components and is conventionally imported as stx, where the X denotes the extra capabilities it provides on top of a vanilla Streamlit application. Within this library, there is a module called Cookie Manager, which will be our tool for this task. Listing 7-9 builds a simple Streamlit application with the capability of setting, getting, and deleting cookies. The controls are customizable based on the developer’s needs. For instance, an expiration date can be set on any new cookie, which will then be deleted automatically once that date is reached. Figures 7-11 and 7-12 show adding and getting an example authentication token, respectively.
import streamlit as st
import extra_streamlit_components as stx
st.title("Cookie Management Demo")
st.subheader("_Featuring Cookie Manager from Extra-Streamlit-Components_")
cookie_manager = stx.CookieManager()
st.subheader("All Cookies:")
cookies = cookie_manager.get_all()
st.write(cookies)
c1, c2, c3 = st.columns(3)
with c1:
    st.subheader("Get Cookie:")
    cookie = st.text_input("Cookie", key="0")
    clicked = st.button("Get")
    if clicked:
        value = cookie_manager.get(cookie)
        st.write(value)
with c2:
    st.subheader("Set Cookie:")
    cookie = st.text_input("Cookie", key="1")
    val = st.text_input("Value")
    if st.button("Add"):
        cookie_manager.set(cookie, val)
with c3:
    st.subheader("Delete Cookie:")
    cookie = st.text_input("Cookie", key="2")
    if st.button("Delete"):
        cookie_manager.delete(cookie)
Listing 7-9
cookie_management.py
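The expiration date mentioned earlier can be computed with a small helper. The sketch below is an assumption-laden illustration: expiry_after is a hypothetical name, and the keyword expires_at in the commented call reflects our understanding of the Cookie Manager's set method, which should be verified against the installed version of the library.

```python
from datetime import datetime, timedelta
from typing import Optional


def expiry_after(days: int, now: Optional[datetime] = None) -> datetime:
    """Return the moment `days` after `now` (defaults to the current time)."""
    return (now or datetime.now()) + timedelta(days=days)


expires = expiry_after(7, now=datetime(2022, 1, 1, 12, 0))
print(expires.isoformat())  # 2022-01-08T12:00:00

# In the application from Listing 7-9, the value could then be passed on, e.g.:
# cookie_manager.set(cookie, val, expires_at=expiry_after(7))
# (`expires_at` is an assumed keyword name; check your installed version.)
```

Passing the reference time as an optional argument keeps the helper easy to test, since the result no longer depends on the clock.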
Please note that the All Cookies section in Figures 7-11 and 7-12 is displayed in a well-structured JSON format, with some cookies redacted for privacy. It is also worth noting that the newly introduced module adds no visual element to this Streamlit application, as it is classified as a service, hence the name Cookie Manager. This does not mean that all other Streamlit-compatible libraries behave the same way, as many of them do include a visual element.
7.6 Summary
In this chapter, we were introduced to storing and accessing session states with Streamlit natively. The use of session state can be essential in many cases and will be utilized extensively to develop advanced applications in subsequent chapters. In addition, the reader was acquainted with session IDs, which are unique identifiers associated with every new instance of a Streamlit application, and shown how they can be used both to store session state persistently in a PostgreSQL database and to record and visualize user insights. Finally, the reader was shown how to store and retrieve cookies in a browser, which is required to store session state across multiple sessions of a Streamlit application on the same browser. This is necessary, for instance, when the user wants to be logged in automatically without re-entering their credentials.