We've already used an estimator: StringIndexer. We said earlier that an estimator holds state that it learns from the data it sees, whereas a transformer does not. So why is StringIndexer an estimator? Because it needs to remember every distinct string it has seen and maintain a mapping table between those strings and their label indexes.
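To make the idea of a mapping table concrete, here is a minimal pure-Python sketch. It is not Spark's implementation: the class names ToyLowercaser and ToyStringIndexer and the attribute label_to_index are hypothetical, invented for illustration. The ordering (most frequent label gets index 0) is modeled on StringIndexer's default behavior, and breaking ties alphabetically is an assumption of this sketch. Also note that Spark's fit returns a separate fitted model object, whereas the sketch simply returns self to stay short.

```python
from collections import Counter


class ToyLowercaser:
    """Transformer-like (hypothetical): nothing is learned from the data."""

    def transform(self, values):
        return [v.lower() for v in values]


class ToyStringIndexer:
    """Estimator-like (hypothetical): must see the data before it can index labels."""

    def __init__(self):
        self.label_to_index = None  # the mapping table, filled in by fit()

    def fit(self, labels):
        counts = Counter(labels)
        # Most frequent label gets index 0; ties are broken alphabetically
        # (an assumption of this sketch).
        ordered = sorted(counts, key=lambda label: (-counts[label], label))
        self.label_to_index = {label: i for i, label in enumerate(ordered)}
        return self  # Spark would return a separate fitted model instead

    def transform(self, labels):
        if self.label_to_index is None:
            raise ValueError("call fit() before transform()")
        return [self.label_to_index[label] for label in labels]


indexer = ToyStringIndexer().fit(["walking", "running", "walking", "sitting"])
mapping = indexer.label_to_index
indices = indexer.transform(["sitting", "walking"])
print(mapping)  # {'walking': 0, 'running': 1, 'sitting': 2}
print(indices)  # [2, 0]
```

The lowercaser works on any input right away, while the indexer raises an error until it has seen data. That difference in learned state is what separates the two kinds of component.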
Another easy way to tell an estimator from a transformer is that an estimator has an additional method called fit. Calling fit on a given dataset builds the estimator's internal data structure from that data, which, in the case of StringIndexer, is the mapping table between label strings and label indexes. In Spark ML, fit hands this back as a fitted model, here a StringIndexerModel, which you then use to transform data. So now let's take a look at another estimator, an actual machine learning algorithm.