Profiling web endpoints for fun and profit
Overview
At VXSoft we treat performance as a feature and take performance issues extremely seriously. Our products are mainly web applications and are therefore exposed to all the performance issues (e.g. traffic spikes, poorly written code, database bottlenecks) and security issues (the OWASP Top Ten) of general web applications living in the wild. The applications have different deployment types, varying from single cloud installations to isolated on-site installations.
The isolated installations are hard to debug and monitor, which makes it difficult to detect the reasons for occasional performance degradation. So we needed a solution, preferably a reusable one, for detecting the problematic web endpoints causing this degradation, after which the engineering team could narrow down the scope of possible investigation.
Logs
Some installations of our systems are isolated from the public network and are inaccessible even to our customers' support team, which was the main reason real-time web application monitoring tools wouldn't suit us. The decision our team ended up with was to collect logs on those installations and process them later on our side. For that goal, we defined the following steps:
Log format
For each request/endpoint the following information is logged into a single log file:
{timestamp in ISO format};{HTTP method};{URI};{elapsed}
Fig. 2. Defined log format
The defined log format contains only the minimal information necessary for identifying slow endpoints, while also providing interesting insights about application usage, the ratio of HTTP methods, etc. An important aspect is that the log doesn't contain any personal or confidential information of our clients.
After enabling logging in the application's configuration file, the system starts collecting timing logs for all endpoints, resulting in a CSV log file like the following:
2021-10-01T12:06:41.009819;GET;/?;0.0012438558042049408
2021-10-01T12:06:41.043376;GET;/am/dashboard/tab/1?;0.057040732353925705
2021-10-01T12:06:41.018917;GET;/am/widgets/5/render?task_id=1306383&incoming_id=0&outgoing_id=0&internal_id=1306383&tab_id=263;0.04639996960759163
...
Fig. 3. Log sample
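To make the format concrete, a single log line can be parsed with a few lines of Python. The PerfRecord helper and parse_line function below are illustrative only, not part of our collector, and assume the URI itself contains no ';' characters:

from datetime import datetime
from typing import NamedTuple


class PerfRecord(NamedTuple):
    # hypothetical helper type, used here only to illustrate the format
    timestamp: datetime
    method: str
    uri: str
    elapsed: float


def parse_line(line: str) -> PerfRecord:
    # format: {timestamp};{method};{uri};{elapsed}
    timestamp, method, uri, elapsed = line.rstrip('\n').split(';')
    return PerfRecord(datetime.fromisoformat(timestamp), method, uri, float(elapsed))


record = parse_line('2021-10-01T12:06:41.009819;GET;/?;0.0012438558042049408')
print(record.method, record.elapsed)  # GET 0.0012438558042049408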
Log collector
Our web applications are Python WSGI applications, which use uWSGI as the application server and stand behind an NGINX reverse proxy. We decided to collect the defined metrics only for our web application, skipping the web server and application server. That gave us the opportunity to estimate the performance of our system while ignoring network latency and other third-party dependencies. It was also mandatory for the log collector to be web framework agnostic, as we use different frameworks for our applications. For example, here is the integration of the perflog log collector with the wheezy.web framework:
application = WSGIApplication(
    [
        bootstrap_defaults(url_mapping=URLS),
        http_error_middleware_factory,
        path_routing_middleware_factory
    ],
    options=config.options
)

# enable performance logging if configuration is set
if config.config.getboolean('logging', 'perf-log', fallback=False):
    application = perflog(application, 'var/log/perflog.csv')
Fig. 4. Enabling performance logging for Wheezy.Web
As can be seen in fig. 4, the log collector simply wraps the existing WSGI application, which makes it universal for any compliant framework. Making it configurable allows the application's owner to control the log collection duration.
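To illustrate that universality, here is a hypothetical integration with Flask, assuming the perflog wrapper (shown in full in fig. 5 below) is importable from a perflog module:

from flask import Flask

# hypothetical import path; perflog is the wrapper from fig. 5
from perflog import perflog

app = Flask(__name__)


@app.route('/')
def index():
    return 'hello'


# Flask exposes its underlying WSGI callable as app.wsgi_app,
# so the collector can wrap it without touching any Flask-specific code
app.wsgi_app = perflog(app.wsgi_app, 'var/log/perflog.csv')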
import atexit
from time import perf_counter
from datetime import datetime


def perflog(app: callable, logfile: str) -> callable:
    # keep the log file open for the lifetime of the process
    fd = open(logfile, 'a')
    atexit.register(lambda: fd.close())

    def wsgi_application(env: dict, start_response: callable):
        # measure how long the wrapped application takes to handle the request
        elapsed = perf_counter()
        result = app(env, start_response)
        elapsed = perf_counter() - elapsed
        # append one CSV line: timestamp;method;uri;elapsed
        fd.write(
            ''.join((
                datetime.now().isoformat(),
                ';',
                env['REQUEST_METHOD'],
                ';',
                f"{env['PATH_INFO']}?{env['QUERY_STRING']}",
                ';',
                str(elapsed),
                '\n',
            ))
        )
        fd.flush()
        return result

    return wsgi_application
Fig. 5. WSGI application wrapper for performance log collection
Fig. 5 presents the source code of the WSGI application wrapper, which collects logs in the defined format for each request. It doesn't noticeably affect the application's performance and is very simple to maintain.
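As a quick local smoke test that the wrapper behaves as expected with any compliant application, one can wrap a trivial WSGI app and serve it with the standard library. This is a throwaway sketch, not part of our deployment, and assumes the wrapper is importable as a perflog module:

from wsgiref.simple_server import make_server

# perflog is the wrapper from fig. 5; the import path is hypothetical
from perflog import perflog


def hello(env, start_response):
    # minimal WSGI application used only to exercise the wrapper
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello\n']


app = perflog(hello, '/tmp/perflog.csv')

with make_server('', 8000, app) as httpd:
    # every request to http://localhost:8000/ now appends a line to the CSV
    httpd.serve_forever()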
Log processing
After collecting logs for some time and receiving them from clients (fig. 1), the next step is processing them. As the defined log format is custom, we decided to build a small custom utility to parse and visualize the data. For that we went with a Jupyter notebook, for its quick prototyping and support of powerful visualization tools. It is also possible to export the composed notebook as an HTML file, which provides decent reporting functionality.
Fig. 6. Performance logging report example
In the notebook (fig. 6), pandas is used for data processing in combination with Plotly for displaying interactive and beautiful charts.
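As an illustration of the data preparation step, the raw CSV can be loaded into a pandas DataFrame roughly as follows. This is a sketch under our own naming assumptions, not the exact notebook code:

import pandas as pd

# the log file has no header row, so column names are assigned here
df = pd.read_csv(
    'perflog.csv',
    sep=';',
    names=['timestamp', 'method', 'uri', 'elapsed'],
)
df['timestamp'] = pd.to_datetime(df['timestamp'])

# strip the query string so that requests to the same endpoint group together
df['endpoint'] = df['uri'].str.split('?').str[0]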
Fig. 7. Performance report analysis: a) top slowest requests, b) distribution of HTTP methods
By using our logging and processing technique we have been able to identify the top slowest requests over time (fig. 7a), the distribution of HTTP methods (fig. 7b), and other interesting insights. By isolating slow endpoints we can then dig further in that direction in order to find the true causes of the slow performance.
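For illustration, the two views in fig. 7 can be reproduced from the DataFrame prepared above with a few lines of pandas and Plotly Express. This is a sketch only; the actual notebook may configure the charts differently:

import plotly.express as px

# a) top slowest requests: mean elapsed time per endpoint, worst first
slowest = (
    df.groupby('endpoint')['elapsed']
    .mean()
    .sort_values(ascending=False)
    .head(10)
    .reset_index()
)
px.bar(slowest, x='elapsed', y='endpoint', orientation='h',
       title='Top slowest endpoints (mean elapsed, seconds)').show()

# b) distribution of HTTP methods across all logged requests
methods = df['method'].value_counts().rename_axis('method').reset_index(name='count')
px.pie(methods, names='method', values='count',
       title='HTTP method distribution').show()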
Conclusion
Performance is a critical and integral part of any piece of software. During an application's lifetime its performance can both worsen and improve, depending on new features, proper monitoring, etc. But identifying the true origin of performance degradation is a very hard problem in itself.
Due to the specifics of our clients and of the applications we develop and support, in many cases we don't have access to the deployment, which makes it very hard to monitor or collect any runtime information. That's why we developed our own solution for collecting and analyzing runtime measurements of web application endpoints. We were able to use that information, and the reports based on it, to isolate the endpoints causing performance degradation for further investigation.
