Go语言学习笔记 您所在的位置:网站首页 爬虫工程师 Go语言学习笔记

Go语言学习笔记

2023-01-09 07:35| 来源: 网络整理| 查看: 265

Web Crawler

练习题目:

In this exercise you'll use Go's concurrency features to parallelize a web crawler.

Modify the Crawl function to fetch URLs in parallel without fetching the same URL twice.

Hint: you can keep a cache of the URLs that have been fetched on a map, but maps alone are not safe for concurrent use!

练习程序:

cher map[string]*fakeResult type fakeResult struct { body string urls []string } func (f fakeFetcher) Fetch(url string) (string, []string, error) { if res, ok := f[url]; ok { return res.body, res.urls, nil } return "", nil, fmt.Errorf("not found: %s", url) } var url_counter = SafeCounter{v: make(map[string]int)} // fetcher is a populated fakeFetcher. var fetcher = fakeFetcher{ "https://golang.org/": &fakeResult{ "The Go Programming Language", []string{ "https://golang.org/pkg/", "https://golang.org/cmd/", }, }, "https://golang.org/pkg/": &fakeResult{ "Packages", []string{ "https://golang.org/", "https://golang.org/cmd/", "https://golang.org/pkg/fmt/", "https://golang.org/pkg/os/", }, }, "https://golang.org/pkg/fmt/": &fakeResult{ "Package fmt", []string{ "https://golang.org/", "https://golang.org/pkg/", }, }, "https://golang.org/pkg/os/": &fakeResult{ "Package os", []string{ "https://golang.org/", "https://golang.org/pkg/", }, }, }

运行结果:

found: https://golang.org/ "The Go Programming Language" not found: https://golang.org/cmd/ found: https://golang.org/pkg/ "Packages" found: https://golang.org/pkg/os/ "Package os" found: https://golang.org/pkg/fmt/ "Package fmt"

学习笔记:

本题目重点要掌握GO中goroutines之间同步锁的用法,对于所有线程都要访问的SafeCounter中的Map数据进行加锁保护,确保同一时间只有一个goroutine能访问并赋值。



【本文地址】

公司简介

联系我们

今日新闻

    推荐新闻

      专题文章
        CopyRight 2018-2019 实验室设备网 版权所有