- S
- In TikTok's documentation space, some technical and business notes are written in Chinese by the China team, so we have to rely on machine translation, which is sometimes inaccurate.
- T
- My task was to devise an efficient solution to this issue.
- A
- The first thing I thought of was to build a web crawler for the internal doc space, identify the documents that require translation, and then message each doc owner to update them.
- But I soon realised that a conventional, sequential web crawler was too inefficient: it took over 50 minutes for our team's documents.
- Since we have to run this solution repeatedly and at large scale, a more efficient approach was needed.
- After detailed research, I came up with a multi-threaded Golang HTTP crawler, which improves the speed significantly by crawling pages concurrently.
- R
- Increased crawling efficiency tenfold, and the program is now actively used as an internal tool at TikTok.
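
The concurrent crawling idea from the Action step can be sketched roughly as below. This is a minimal illustration, not the actual internal tool: the names `CrawlAll`, `Fetcher`, `httpFetcher`, and `Result` are hypothetical, the worker count and timeout are arbitrary, and it assumes the "threads" are goroutines in a fixed-size worker pool. Only the fetch step is shown, without link discovery or language detection.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"sync"
	"time"
)

// Result holds the outcome of fetching one URL (hypothetical type).
type Result struct {
	URL  string
	Size int // number of bytes read from the response body
	Err  error
}

// Fetcher retrieves a URL and returns the size of its body.
// Making it injectable lets the pool be exercised without network access.
type Fetcher func(url string) (int, error)

// httpFetcher returns a Fetcher backed by one shared http.Client.
func httpFetcher(timeout time.Duration) Fetcher {
	client := &http.Client{Timeout: timeout}
	return func(url string) (int, error) {
		resp, err := client.Get(url)
		if err != nil {
			return 0, err
		}
		defer resp.Body.Close()
		body, err := io.ReadAll(resp.Body)
		return len(body), err
	}
}

// CrawlAll fetches every URL using a fixed pool of worker goroutines.
// Results are returned in the same order as the input URLs.
func CrawlAll(urls []string, workers int, fetch Fetcher) []Result {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan int) // indexes into urls
	results := make([]Result, len(urls))

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				size, err := fetch(urls[i])
				// Each index is written by exactly one worker, so no lock is needed.
				results[i] = Result{URL: urls[i], Size: size, Err: err}
			}
		}()
	}

	for i := range urls {
		jobs <- i
	}
	close(jobs) // lets the workers exit their range loops
	wg.Wait()
	return results
}

func main() {
	// URLs come from the command line; with no arguments nothing is fetched.
	urls := os.Args[1:]
	for _, r := range CrawlAll(urls, 8, httpFetcher(10*time.Second)) {
		if r.Err != nil {
			fmt.Println(r.URL, "error:", r.Err)
			continue
		}
		fmt.Println(r.URL, r.Size, "bytes")
	}
}
```

A fixed pool caps the number of in-flight requests, which is usually kinder to the server being crawled than starting one goroutine per URL. The speed-up comes from overlapping network waits, so the gain depends on latency and on how many workers the server tolerates.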