Go语言学习笔记 | 您所在的位置:网站首页 › 爬虫工程师 › Go语言学习笔记 |
Web Crawler
练习题目: In this exercise you'll use Go's concurrency features to parallelize a web crawler. Modify the Crawl function to fetch URLs in parallel without fetching the same URL twice. Hint: you can keep a cache of the URLs that have been fetched on a map, but maps alone are not safe for concurrent use! 练习程序: cher map[string]*fakeResult type fakeResult struct { body string urls []string } func (f fakeFetcher) Fetch(url string) (string, []string, error) { if res, ok := f[url]; ok { return res.body, res.urls, nil } return "", nil, fmt.Errorf("not found: %s", url) } var url_counter = SafeCounter{v: make(map[string]int)} // fetcher is a populated fakeFetcher. var fetcher = fakeFetcher{ "https://golang.org/": &fakeResult{ "The Go Programming Language", []string{ "https://golang.org/pkg/", "https://golang.org/cmd/", }, }, "https://golang.org/pkg/": &fakeResult{ "Packages", []string{ "https://golang.org/", "https://golang.org/cmd/", "https://golang.org/pkg/fmt/", "https://golang.org/pkg/os/", }, }, "https://golang.org/pkg/fmt/": &fakeResult{ "Package fmt", []string{ "https://golang.org/", "https://golang.org/pkg/", }, }, "https://golang.org/pkg/os/": &fakeResult{ "Package os", []string{ "https://golang.org/", "https://golang.org/pkg/", }, }, }运行结果: found: https://golang.org/ "The Go Programming Language" not found: https://golang.org/cmd/ found: https://golang.org/pkg/ "Packages" found: https://golang.org/pkg/os/ "Package os" found: https://golang.org/pkg/fmt/ "Package fmt"学习笔记: 本题目重点要掌握GO中goroutines之间同步锁的用法,对于所有线程都要访问的SafeCounter中的Map数据进行加锁保护,确保同一时间只有一个goroutine能访问并赋值。 |
今日新闻 |
推荐新闻 |
专题文章 |
CopyRight 2018-2019 实验室设备网 版权所有 |