by royalghost on 1/24/25, 10:29 PM with 19 comments
We are facing an intermittent issue in our web application where, for some users, HTTP requests end in errors (400s), especially during token refresh with the authentication server.
Normally, we would ask the user to generate a HAR (HTTP Archive) file and inspect it to find the root cause. However, this time it is challenging to collect the HAR file manually because the error is not consistent. Sometimes it seems to go away, but then it suddenly reappears, causing a bad user experience.
It is also hard to add logs etc. because the token refresh happens on the client side, in the browser, so technically there are no traces of it on the server side.
I am looking into ways to automate generating the HAR file, but it does not seem straightforward.
If any of you have faced a similar issue in the past and found a way to add such error logging in a web service, let me know. Any other thoughts and suggestions are highly appreciated.
Thank you in advance.
by lolinder on 1/26/25, 1:57 PM
I've seen HAR files containing Google account session tokens attached in plain text to Jira tickets. If you end up leaking those tokens your customers will not be amused.
See the Okta breach:
https://www.rezonate.io/blog/har-files-attack-okta-customers...
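If HAR files do get passed around, one mitigation is to scrub them first. Here is a minimal sketch of that idea, not a vetted sanitizer: the types cover only a small subset of the HAR structure, and `sanitizeHar`, `SENSITIVE_HEADERS`, and the `[REDACTED]` marker are names made up for illustration. It blanks credential-bearing headers, all cookie values, and request/response bodies. Tokens embedded in URLs or query strings are not handled here.

```typescript
// Minimal subset of the HAR structure; real files carry many more fields.
interface NameValue { name: string; value: string }
interface HarBody { mimeType?: string | undefined; text?: string | undefined }
interface HarRequest {
  url: string;
  headers: NameValue[];
  cookies?: NameValue[] | undefined;
  postData?: HarBody | undefined;
}
interface HarResponse {
  status: number;
  headers: NameValue[];
  cookies?: NameValue[] | undefined;
  content?: HarBody | undefined;
}
interface HarEntry { request: HarRequest; response: HarResponse }
interface Har { log: { entries: HarEntry[] } }

const REDACTED = "[REDACTED]";
// Assumed list of headers that usually carry credentials; extend it for your app.
const SENSITIVE_HEADERS = ["authorization", "cookie", "set-cookie"];

// Replace the value of credential-bearing headers, keep everything else.
function redactHeaders(headers: NameValue[]): NameValue[] {
  return headers.map((h) =>
    SENSITIVE_HEADERS.indexOf(h.name.toLowerCase()) !== -1
      ? { ...h, value: REDACTED }
      : { ...h }
  );
}

// Cookie values are always treated as secret.
function redactCookies(cookies: NameValue[] | undefined): NameValue[] {
  return (cookies ?? []).map((c) => ({ ...c, value: REDACTED }));
}

// Bodies may contain refresh or access tokens, so only the MIME type is kept.
function redactBody(body: HarBody | undefined): HarBody | undefined {
  return body === undefined ? undefined : { mimeType: body.mimeType, text: REDACTED };
}

// Returns a scrubbed copy; the input object is not modified.
function sanitizeHar(har: Har): Har {
  return {
    log: {
      ...har.log,
      entries: har.log.entries.map((e) => ({
        ...e,
        request: {
          ...e.request,
          headers: redactHeaders(e.request.headers),
          cookies: redactCookies(e.request.cookies),
          postData: redactBody(e.request.postData),
        },
        response: {
          ...e.response,
          headers: redactHeaders(e.response.headers),
          cookies: redactCookies(e.response.cookies),
          content: redactBody(e.response.content),
        },
      })),
    },
  };
}
```

Status codes, URLs, and non-sensitive headers survive, which is often enough to see why a refresh call returned a 400.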
by smittywerben on 1/26/25, 3:23 PM
I'd sooner be testing in a lab environment recording a pcap file on both sides to try to get the client's TLS session to break before I'd want a client's confidential credential flow sent to me. I don't like to bother people. I've always hated refresh tokens, at least OAuth's design of them. Is sending a client's decrypted MITM logs around really safer?
by alp1n3_eth on 1/26/25, 3:17 PM
Echoing some other suggestions, but to a different extent, increase logging in the problem areas both client-side and server-side. It might be directly related to the token refresh since it only happens there, so a great place to start is within that functionality. Log the entire connection's info to both services (front and back logging) and if users are manually submitting tickets you should be able to track them down by userID / IP in the logs.
Also extend the fuzzing capabilities of your tests through browser automation (potentially headless, depending on the issue) that authenticates and uses the app "normally". Keep it using the app on repeat, and when token refresh time comes, see if the error pops up. Throw some extra variables in there: ensure it's off the corporate network, or routed through DCs farther away, to see if it's a latency issue somewhere else. You could log the HAR file for this.
Multiple versions of tests might need to be run in parallel with different modifiers, such as one being allowed to directly communicate w/ the origin, vs. another going through the CDN like a standard customer would.
This is also an edge case, but I've seen it pop up sometimes: ensure that there aren't any other required variables that are missing during the refresh process. Sometimes specific functionality in some apps is tied to a custom header, and sometimes the value isn't updated to what the app expects. Things like that could throw the process off from another angle.
by solardev on 1/26/25, 4:31 AM
by davidt84 on 1/26/25, 9:42 AM
by geocar on 1/26/25, 2:10 PM
Also, do you actually need the HAR file? Or just a log of your servers' inputs/outputs from the clients' perspective? You can get that The Boring Way if you don't have a CSP issue, so maybe solve that issue?
by dewey on 1/26/25, 9:36 AM
by phrotoma on 1/26/25, 2:51 PM
by viraptor on 1/26/25, 11:28 AM
You can totally add logging for that. If you don't have an existing service that can handle it, you can create a logging-only endpoint for that purpose and send the event async to not block other work.
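Something along these lines, as a rough sketch rather than a drop-in implementation. The event shape, the function names, and the `/client-log` path are all hypothetical. The event keeps header names but not header values, and drops the query string, so the log itself doesn't become a credential leak. The response snippet is kept on the assumption that the auth server's error bodies contain no secrets.

```typescript
// Hypothetical shape of the event sent to a logging-only endpoint.
interface RefreshFailureEvent {
  at: string;
  url: string;
  status: number;
  durationMs: number;
  responseSnippet: string;
  requestHeaderNames: string[];
}

// Build the event without copying secret values: header values and the
// request body are left out on purpose, and the query string is dropped.
function buildRefreshFailureEvent(
  url: string,
  status: number,
  durationMs: number,
  responseBody: string,
  requestHeaders: Record<string, string>,
  now: Date = new Date()
): RefreshFailureEvent {
  const q = url.indexOf("?");
  return {
    at: now.toISOString(),
    url: q === -1 ? url : url.slice(0, q),
    status,
    durationMs,
    responseSnippet: responseBody.slice(0, 500),
    requestHeaderNames: Object.keys(requestHeaders)
      .map((h) => h.toLowerCase())
      .sort(),
  };
}

// The transport is injected so the same code works in tests and in the browser.
type Sender = (endpoint: string, payload: string) => void;

// Fire-and-forget: a failure to log must never break the app itself.
function reportRefreshFailure(
  event: RefreshFailureEvent,
  send: Sender,
  endpoint: string = "/client-log" // assumed path, not from the thread
): void {
  try {
    send(endpoint, JSON.stringify(event));
  } catch {
    // Swallow logging errors deliberately.
  }
}

// In a browser, the sender could be:
//   (endpoint, payload) => { navigator.sendBeacon(endpoint, payload); }
// which queues the request without blocking other work.
```

Call `reportRefreshFailure` from wherever the refresh request's non-2xx response is handled; the logging endpoint then gives you the server-side trace that is otherwise missing.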
by Zanfa on 1/26/25, 1:46 PM
by sim7c00 on 1/26/25, 10:23 AM
as some other commenter said, automating har files might not be ideal as it could collect far too much info, and browsers will make this very difficult to automate.
perhaps you can add client side logging and automate gathering that, or ask users for that rather than a har file. like "if xyz happens again please send us the log from location yzw". not sure if that is possible but it would at least unburden users from running devtools on an intermittent issue. if it happens only to a few users you can add it optionally to their client side, like a debug/trace mode. if it happens widespread i'd say add it for all users.
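a rough sketch of what such a debug/trace mode could look like: a small in-memory ring buffer that only records while enabled and can be dumped on request. `RingLog` and its methods are hypothetical names, and the capacity of 200 is an arbitrary assumption.

```typescript
interface LogEntry {
  at: string;
  message: string;
}

// Keeps only the most recent `capacity` entries so memory use stays bounded.
class RingLog {
  private entries: LogEntry[] = [];
  private capacity: number;
  private enabled: boolean;

  constructor(capacity: number = 200, enabled: boolean = false) {
    this.capacity = capacity;
    this.enabled = enabled;
  }

  // Turn tracing on for affected users only, or for everyone.
  setEnabled(on: boolean): void {
    this.enabled = on;
  }

  // Record a message; does nothing while tracing is disabled.
  log(message: string, now: Date = new Date()): void {
    if (!this.enabled) return;
    this.entries.push({ at: now.toISOString(), message });
    if (this.entries.length > this.capacity) {
      this.entries.shift(); // drop the oldest entry
    }
  }

  // Serialize the buffer so a user can copy it or the app can upload it.
  dump(): string {
    return JSON.stringify(this.entries, null, 2);
  }
}
```

log only things like status codes and timings around the token refresh, never the tokens themselves, so the dump is safe to send in.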
good luck and happy to see you're not giving up just yet :D these issues can be quite frustrating to get good data on. keep at it and you'll find it eventually.
it might also be possible to automate a client at your own side and run it until it hits the issue. no guarantee it will actually hit it though. you can run it from office, home, and try to have many colleagues / people run it in different (maybe personal) setups.
by new_user_final on 1/26/25, 2:23 PM
by mariogintili on 1/26/25, 12:50 PM
by moltar on 1/26/25, 2:40 PM