Date: 2019-12-06 10:50

CS602 - Data-Driven Development with Python, Fall 2019
Programming Assignment 6

Getting started
Review class handouts and examples, and work on the reading and practice assignments posted on the course schedule. This assignment is designed to practice data manipulation with Pandas and plotting with Matplotlib.

Programming Project: Plotting (worth: 25 points)
Create a plot and barcharts visualizing hotel ratings.

Data and program overview
In this assignment you will be working with data on hotel reviews. The task is to create a plot showing mean ratings and number of reviews for a selection of hotels in a chosen state, and a barchart that shows the percentage of reviews with each rating.

The following data will be provided as csv files:
- A table with information on hotel location (hotels.csv); we will call this the hotel data.
- A table with records of customer reviews of their stays in the hotels (hotelreviews.csv); referred to henceforth as the reviews data. Each review references the hotel name and city; these parameters uniquely identify the corresponding hotel in the hotel data.

The files are supplied in a zip file, which will create a data subfolder when unpacked. Review the data before you read the rest of this handout.

Overview of the program
The program should work as follows.
1. Ask the user for the subfolder and names of the two data files (see interaction).
2. Ask the user to enter a state, verifying that the state is one of the states for which hotel information is available in the hotel data. If the user's input is not found in the appropriate column of the hotel data, the prompt must be repeated until a valid state is entered.
3. Identify all cities with a hotel in the state, based on the hotel data. Provide a numbered sequence of cities in the specified state and ask the user to enter up to four numbers from the list.
   Input should be repeated until the user enters one to four numbers from the numbered list. You may assume the user will be entering numbers only.
4. Identify all hotels that are located in the selected cities (you can assume city names are unique across all states). Display the names of these hotels.
5. Display a hotel reviews plot (described below), and save the plot as plot1.jpg using the plt.savefig() function.
6. For the three highest rated hotels among those selected, display a rating percentage barchart showing the percentage of reviews with specific ratings (described below). Save these plots as barchart1.jpg, barchart2.jpg, barchart3.jpg.

Sample interactions: user input appears in boldface, with the generated plots shown after the text of the interaction.

Please enter names of the subfolder and files: data hotels.csv hotelreviews.csv
Please enter state, e.g. MA: MA
1              Auburn
2              Boston
3            Brockton
4           Cambridge
5           Fitchburg
6    West Springfield
dtype: object
Select cities from above list by entering up to four indices on the same line: 2 4
You have selected the following cities:
        city
2     Boston
4  Cambridge
Displaying rating information for the following hotels:
                                             name       city  province
0                           The Inn @ St. Botolph     Boston        MA
1                              40 Berkeley Hostel     Boston        MA
2                  A Bed & Breakfast In Cambridge  Cambridge        MA
3  Holiday Inn Express Hotel and Suites Cambridge  Cambridge        MA
Exiting...

The following interaction demonstrates how invalid input should be handled and the messages to be shown for invalid input (highlighted). The graphs are omitted.

Please enter names of the subfolder and files: data hotels.csv hotelreviews.csv
Please enter state, e.g. MA: Provence
We have no data on hotels in Provence
Please enter state, e.g. MA: az
We have no data on hotels in az
Please enter state, e.g. MA: AZ
1               Eloy
2           Glendale
3               Mesa
4             Payson
5            Phoenix
6    Prescott Valley
7             Tucson
8            Wellton
dtype: object
Select cities from above list by entering up to four indices on the same line: 5 6 8 10
Selection must range from 1 to 8
Select cities from above list by entering up to four indices on the same line: 1 2 3 4 5 6 7
You selected 7 items, must select up to four
Select cities from above list by entering up to four indices on the same line: 5 7
You have selected the following cities:
      city
5  Phoenix
7   Tucson
Displaying rating information for the following hotels:
                                                name     city  province
0        La Quinta Inn and Suites Tucson - Reid Park   Tucson        AZ
1  La Posada Lodge & Casitas, An Ascend Hotel Col...   Tucson        AZ
2   Residence Inn By Marriott Tucson Williams Centre   Tucson        AZ
3  Holiday Inn Express & Suites Phoenix Downtown ...  Phoenix        AZ
4                                Park Terrace Suites  Phoenix        AZ

Overview of plots
1. The hotel reviews plot is the first plot shown in the interaction. For each of the hotels in the selected cities, this plot visualizes the number of reviews as a coordinate on the x axis and the average rating as a coordinate on the y axis. Hotels must be displayed as colored points, annotated with the hotel name, and (for full credit) using a color corresponding to the city, as shown in the plot legend. Axes must be clearly labeled as shown, and the title should be as shown. Do not worry about the hotel names overlapping due to placement of annotations, or extending beyond the plot boundaries.
2. The rating percentage barchart is generated for the three top-rated hotels. Each barchart displays a bar graph produced using the Matplotlib function plt.bar(), showing what percentage of all reviews have the specific rating (1 through 5). This percentage is computed by calculating the total number of reviews with the rating and dividing it by the total number of reviews for the hotel.
   The percentage value must be displayed clearly on top of the bar. Axes must be clearly marked and labeled as shown, and the title of the chart should include the name of the hotel, its city and state information, as well as the total number of reviews of the hotel.

Required Functions
Include
- main() to read the location of the input files and call other functions to run the whole program;
- function pickStateAndCities() that will run the state and city selection procedure and return the user-chosen state and all cities in it;
- function selectHotelReviews() to select and return reviews for the hotels in the selected cities, so that the hotel reviews plot can be generated;
- functions reviewsRatingsPlot() and ratingPercentageBarchart() to generate and save the appropriate plots.
Pick function parameters and return values as you see fit, and define other functions as needed.

General Requirements
- You can assume that the provided files will have all of the columns involved in the required computations, but the number and content of records and the order of columns may be different.
- Your program should have no code outside of function definitions, except for a single call to main() and the global variables described in the next bullet.
- In order to make the code easier to modify for a different set of column names, define global variables that store the names of columns that your program uses (e.g. CITY = 'city') and use the global variables throughout your code.
- All file related operations must use device-independent handling of paths (use the os.getcwd() and os.path.join() functions to create paths, instead of hardcoding them).

Submission and Grading
Submit your code along with the image files that your program will generate for the input data contained in the first sample interaction.
Grading will be based on accuracy (conforming to all the requirements and the format of the interaction), generality of code, and the appropriate use of pandas/numpy/matplotlib resources (data structures and functions). Two points will be awarded for programming style.

Created by Tamara Babaian on November 23, 2019
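
The input-handling rules from steps 2 and 3 of the program overview, together with the path requirement from General Requirements, can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the expected solution: the helper names (data_path, check_state, parse_indices, check_selection, ask_until_valid) are hypothetical, and reusing the "You selected ... items" message for an empty selection is an assumption, since the handout only shows it for seven items. In the full program the set of valid states and the number of cities would come from the hotel data loaded with pandas.

```python
import os

MAX_CITIES = 4  # the handout allows one to four city selections


def data_path(subfolder, filename):
    """Build a device-independent path below the current working directory."""
    return os.path.join(os.getcwd(), subfolder, filename)


def check_state(state, known_states):
    """Return None if the state is known, otherwise the handout's message."""
    if state in known_states:
        return None
    return f"We have no data on hotels in {state}"


def parse_indices(line):
    """Split a line such as '2 4' into integers (numbers-only input assumed)."""
    return [int(token) for token in line.split()]


def check_selection(indices, n_cities):
    """Return None for a valid selection, otherwise an error message."""
    if not 1 <= len(indices) <= MAX_CITIES:
        # Wording follows the sample interaction; using it for 0 items is assumed.
        return f"You selected {len(indices)} items, must select up to four"
    if any(i < 1 or i > n_cities for i in indices):
        return f"Selection must range from 1 to {n_cities}"
    return None


def ask_until_valid(prompt, parse, check, read=input, show=print):
    """Repeat the prompt until check() reports no error, then return the value."""
    while True:
        value = parse(read(prompt))
        error = check(value)
        if error is None:
            return value
        show(error)
```

Passing read and show as parameters keeps the loop testable without a keyboard; in the program itself the defaults input and print apply. The state comparison is deliberately case-sensitive, matching the sample interaction in which az is rejected and AZ is accepted.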