Assignment 3 (CSE30: Computer Organization and Systems)

 

Introduction: Programming in C for Building a Simple ETL Program

The movement of data from a source system to a destination system often requires the data to be reformatted to meet the input requirements of the destination system. Source systems range from sensor data from a hardware system (like solar panel operating data) to rows extracted from a database system. Target systems can be other devices or a different database system. In the world of enterprise data management this data movement process is broadly called ETL (Extract, Transform and Load) processing. ETL software may be purchased from vendors but is often custom written to meet the unique requirements of the output data format of the source system and the input data format of the target system.

Extract processing reads data from standard input (directly from the source system or from a “spool” file created by the source system). Transform processing works on the input data and may include operations to validate, modify and reorder the input data. Load processing writes the transformed data to the target system via standard output.

A common format for moving data between systems is called a CSV file.

A comma-separated values (CSV) file uses a comma (‘,’) to separate values on a line of data. Each line of the file, or “data record”, is terminated by a newline (‘\n’). Each record consists of one or more data fields (columns) separated by commas. Fields are numbered from left to right starting with column 1. The use of the comma as a field separator is the source of the name for this data format. A CSV file stores tabular data (numbers and text) in plain text (ASCII strings). In a proper CSV file, each line always has the same number of fields (columns).

In the example above we have a sample of a Call Detail Record (CDR) in CSV format. A CDR describes a cell phone voice call from one phone to another phone. Each record has 10 fields or columns. The first record of the file holds the label for each column (field).

Any column can be empty. The last column is never followed by a ‘,’; every record always ends with a ‘\n’.

An empty Call Detail Record would have nine (9) commas in it as shown below.

,,,,,,,,,

For the purpose of this assignment, assume the following of the input data:

  • The CSV file is ASCII text based and can be edited by any text editor.
  • Every line ends with a ‘\n’.
  • All records have the same number of data fields.
  • A data field can be empty only when there is more than 1 field in each record.

For example, a 3 field CSV has the following variations

  • 1,2,3
  • I am a string, another string,5
  • ,,4
  • ,,

Spaces (or lack of spaces) in a field are to be preserved in this assignment.

In this assignment, you will write a program that reads CSV data from standard input and writes a modified CSV file to standard output. In the description, record columns (fields) are numbered from 1, the leftmost column, to N, the last column. (N is also the number of columns in a single record.)

Requirements:

  1. This ETL program requires a single command line flag, -c, that must be followed by an unsigned integer option argument that shall be one (1) or larger. This is the number of columns (fields) that every valid record in the input data (read from stdin) must have.
  2. After the input column specification, there must be one or more additional column numbers (unsigned integers, 1 or greater). Each of these arguments specifies one of the input record columns, so they must range in value from 1 to the number of columns (fields) in the input file.
  3. The order of the fields in the argument list specifies the order in which the input record fields appear in the output CSV record that is written to standard output (stdout).
  4. All usage and error messages are written to stderr.
  5. If the mandatory -c flag is missing, the usage is to be printed to stderr and the program exits with EXIT_FAILURE as a return value from the program.
  6. Successful processing will return EXIT_SUCCESS from the program.
  7. You must use getopt() to parse the command line.

Usage is

./cnvtr -c input_column_count col# [col#…]
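The flag handling in the requirements above can be sketched as follows. This is only a minimal illustration, not the course's reference solution: the helper name parse_column_count, its return convention, and the exact usage wording are assumptions made for this example. It uses getopt() with the option string "c:" and atoi(), both from the permitted function list.

```c
#define _POSIX_C_SOURCE 200809L   /* expose getopt() under strict -std= modes */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Print the usage line to stderr; prog is argv[0]. */
static void usage(const char *prog)
{
    fprintf(stderr, "Usage: %s -c input_column_count col# [col#...]\n", prog);
}

/*
 * Hypothetical helper: parse the mandatory -c flag with getopt().
 * Returns the input column count (>= 1) on success, or -1 after printing
 * the usage when -c is missing or invalid, or when no output column
 * arguments follow it. On success, optind is the index in argv of the
 * first output column argument.
 */
int parse_column_count(int argc, char **argv)
{
    int cols = 0;   /* 0 means -c has not been seen yet */
    int opt;

    while ((opt = getopt(argc, argv, "c:")) != -1) {
        switch (opt) {
        case 'c':
            cols = atoi(optarg);
            break;
        default:    /* unknown flag or missing option argument */
            usage(argv[0]);
            return -1;
        }
    }
    if (cols < 1 || optind >= argc) {
        usage(argv[0]);
        return -1;
    }
    return cols;
}
```

A caller in main() would typically exit with EXIT_FAILURE when the helper returns -1, and otherwise treat argv[optind] through argv[argc - 1] as the output column list.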

Sample Examples:

Example #1

Given an input file with 4 columns containing the following 3 records of data:

10,20,30,40

a,b,c,d

this is input,more input,3,last input

Calling the program as:

./cnvtr -c 4 4 3 2 1 < input_file > output_file

It says to read an input CSV file where each record has 4 columns. The output specification is to write a CSV file where each output record has a column order of 4,3,2,1 from each input record.

The columns above are in order 1,2,3,4. For example, in the record 10,20,30,40 the value 10 (column 1) is entry 0 in the array.

The output will contain three records that look like:

40,30,20,10

d,c,b,a

last input,3,more input,this is input

You do not need to parse “<” and “>” from the command line in your program. These are redirection operators and are parsed by the shell before the arguments even arrive in your program.

Example #2

Calling the program cnvtr with the same input, but as:

./cnvtr -c 4 3 < input_file > output_file

It says to read a CSV file with 4 columns and only write column 3 of the input file to the output CSV file.

The output will contain three records that look like:

30

c

3

Example #3

Some CSV files allow fields to contain commas. To support this, each field can optionally be enclosed in quotes.

An input CSV file with four fields per record could look like:

1,2,"test,string",4

When used as

./cnvtr -c 4 3 4

Output is

"test,string",4

Note: There will not be any inputs with unmatched quotes. For example, 1,2,"test,string,4 is an illegal input. A quote will never be part of a string (i.e., there is no escape character).

Description:

  1. [8 points] Write a C program that reads CSV data from standard input and writes a CSV file to standard output.
  2. [2 points] Handling of CSV files that include commas within fields.

Part 1:

Step 0: Getting started

We’ve provided you with a Makefile. Download the Makefile to use.

  1. Download to get the Makefile and README.
  2. Fill out the fields in the README before turning it in.
  3. Open your favorite text editor or IDE of choice and create new files named converter.c and converter.h.

* The files you turn in must be named as specified or else the autograder and Makefile provided will not work.

The Makefile provided will create a cnvtr executable from the converter.c file you will write. Run it by typing make into the terminal. By default, it will compile with warnings turned on. Run make no_warnings to compile without warnings. Run make clean to remove files generated by make.

Step 1: Setup

Setup 1: Parse the command line options and make sure you get the -c col value.

This col value is the number of fields in the input file. You will use it to validate that each input record has the correct number of fields (columns). On the command line there must also be one or more additional args. These args specify field numbers in the input file. You will need to verify that every output arg is in the range from 1 to the number of fields in each input record (the value associated with the -c flag). If there is an error, print the error message to stderr along with the usage and exit (EXIT_FAILURE).

Setup 2: Build two arrays, one for input processing and one for output processing.

 For input, you will need to read a line of input using getline() (see man 3 getline) and break it into tokens. Each token is delimited by either a ‘,’ or a newline ‘\n’ in the input record buffer. Using malloc(), allocate an array of pointers to chars. The number of array entries is the same as the number of fields in the input file. Each entry will point to the start of each field. This array is similar to *argv[], except we do not need to null terminate it with an empty entry.

As an example, say we are processing records that have 10 input fields (like the CDR record above), the input processing array will look like this.

For output, you will need to create an array of int using malloc(). The number of elements in the array is equal to the number of fields that will be written to the output. This array’s size is determined by the number of args passed on the command line after the -c flag and its value. Each entry contains the column number to be output, and the order of the entries, starting from index 0, is the order in which the columns are written. Let’s assume there were three output column arguments: 3,1,9. The output array then holds 3, 1 and 9 at indexes 0, 1 and 2.
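One way the output array could be built and validated is sketched below. The helper name build_output_columns and its parameters are hypothetical, chosen for this illustration; the error message wording is also an assumption. It stores the 1-based column numbers exactly as given, leaving the conversion to 0-based indexes for output time.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert the output column arguments to an int array.
 * args   : pointer to the first output column argument (e.g. argv + optind)
 * nargs  : number of output column arguments (assumed to be at least 1)
 * incols : the input column count given with -c
 * Returns a malloc()ed array of 1-based column numbers in output order,
 * or NULL if allocation fails or any argument is outside 1..incols.
 * The caller releases the array with free().
 */
int *build_output_columns(char **args, int nargs, int incols)
{
    int *out = malloc((size_t)nargs * sizeof(*out));

    if (out == NULL) {
        fprintf(stderr, "out of memory\n");
        return NULL;
    }
    for (int i = 0; i < nargs; i++) {
        int col = atoi(*(args + i));

        if (col < 1 || col > incols) {
            fprintf(stderr, "column %s is out of range 1..%d\n",
                    *(args + i), incols);
            free(out);
            return NULL;
        }
        *(out + i) = col;   /* position i is the i-th column written */
    }
    return out;
}
```

Note that atoi() cannot distinguish a non-numeric argument from 0, so both are rejected here by the range check.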

Step 2: Processing

Processing 1:

Read one line from stdin using getline(). Set the first pointer in the input array to the start of the input buffer. Walk the input buffer using pointers only, looking for a ‘,’ or a ‘\n’, and making sure to stop at ‘\0’. For each ‘,’, and for the final newline ‘\n’ after the last field, replace the ‘,’ or ‘\n’ with a ‘\0’ to terminate that token. After each ‘,’, store the pointer to the next char in the next input array entry. Repeat this process until you either reach the end of the input line or fill all array entries. You should not need to zero out the array on each pass; you just keep track of which element pointer is being processed. At the end of processing the input line, you should have an array filled with pointers to the start of each field (column) in the record, and each field is a properly ‘\0’ terminated string.

Using the CDR example again, say that we start with this CSV input buffer after it was filled by getline(). The newline and the null highlighted in red are really just two chars (shown with the \ escape as an illustration).

After processing the CSV record buffer (the one filled by getline()), we have the following data structure alignment between the input array and the CSV buffer.

You can see that each ‘,’ in the CSV record buffer, as well as the ‘\n’ at the end, has been replaced by a null character ‘\0’. Each entry in the input array of pointers to chars points at the start of a field in the record. Array[0] points at field 1, etc. All the entries in the array are updated each time a new record is processed.
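The in-place tokenizing described in Processing 1 can be sketched as below. The function name split_record and its return convention are assumptions made for illustration, and the handout does not say here what to do with a record that has the wrong number of fields, so this sketch only reports the count and leaves that decision to the caller. The fields array is expected to come from malloc() with room for ncols pointers.

```c
#include <stddef.h>

/*
 * Hypothetical helper: split one CSV record in place.
 * buf    : the line filled by getline(); separators are overwritten
 * fields : caller-allocated array with room for ncols pointers
 * ncols  : expected number of fields (the -c value, at least 1)
 * Returns the number of fields found, or ncols + 1 as soon as the record
 * turns out to have more than ncols fields. The caller compares the
 * result with ncols to detect a malformed record.
 */
int split_record(char *buf, char **fields, int ncols)
{
    char **next = fields;   /* next array entry to fill */
    char *p = buf;

    *next++ = p;            /* field 1 starts at the start of the buffer */
    while (*p != '\0') {
        if (*p == ',') {
            *p = '\0';              /* terminate the current field */
            if (next - fields == ncols)
                return ncols + 1;   /* array is full: too many fields */
            *next++ = p + 1;        /* next field starts after the comma */
        } else if (*p == '\n') {
            *p = '\0';              /* terminate the last field */
            break;
        }
        p++;
    }
    return (int)(next - fields);
}
```

No string is copied: every pointer in fields refers into the buffer that getline() filled, which is why the array has to be refilled for each record.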

Processing 2:

Once you have your array of pointers to fields filled for one record, you will need to write the output. Walk down the output array from index 0. Each entry in the output array contains the column number of the next field to write to standard output. Use this number as an index into the input array to get the pointer to the input field you need to output (watch your indexes: they start at 0, while columns are numbered from 1). Remember that your output must also be in correct CSV format.

Using the output array as above, we would write the fields to the output by doing a printf directly from the input array of pointers (buf). Each output record has 3 fields, selected through the buf[2], buf[0], and buf[8] pointers from the input array, after mapping field numbers to array indexes (subtracting 1). Notice we do not have to copy any strings, use strlen() or allocate more space beyond the two arrays created at program start.
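A minimal sketch of this output step follows. The name write_record and the FILE * parameter are assumptions for illustration (taking a stream makes the helper easy to try out; the assignment itself writes to stdout). Each field is followed by a comma except the last, which is followed by a newline, so the result stays valid CSV.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write the selected fields of one record as CSV.
 * out    : destination stream (stdout in the assignment)
 * fields : pointers to the start of each input field (entry 0 = column 1)
 * order  : 1-based column numbers in the order they should be written
 * nout   : number of entries in order
 */
void write_record(FILE *out, char **fields, const int *order, int nout)
{
    for (int i = 0; i < nout; i++) {
        /* Columns are numbered from 1, array entries from 0. */
        char *field = *(fields + *(order + i) - 1);

        /* A comma follows every field except the last, which gets '\n'. */
        fprintf(out, "%s%c", field, (i == nout - 1) ? '\n' : ',');
    }
}
```

With the fields a, b, c, d and the order 4, 3, 2, 1, a call such as write_record(stdout, fields, order, 4) writes the line d,c,b,a.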

Processing 3:

Loop to Processing 1 and repeat for the next record. When end-of-file (EOF) is reached, return properly with EXIT_SUCCESS.

Here is a warm-up exercise before the output array references the input array: understand getline() and implement a similar loop in a file titled converter.c.

Note: This snippet of code doesn’t give any idea of actual implementation details. It is just a reference for how getline() works. For more details, please see man 3 getline.
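For orientation, a typical getline() read loop looks roughly like the sketch below. It is not the course's own snippet; the function name echo_lines and the stream parameters are assumptions for illustration. It copies every line from one stream to another and returns the number of lines read.

```c
#define _POSIX_C_SOURCE 200809L   /* expose getline() under strict -std= modes */
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: copy every line from in to out using getline().
 * When the buffer pointer is NULL and the capacity is 0, getline()
 * allocates the buffer itself and grows it as needed; the same buffer is
 * reused for every line and released once with free() after the loop.
 * Returns the number of lines read.
 */
long echo_lines(FILE *in, FILE *out)
{
    char *line = NULL;   /* buffer managed by getline() */
    size_t cap = 0;      /* current capacity of that buffer */
    long count = 0;

    /* getline() returns -1 at end-of-file or on error. */
    while (getline(&line, &cap, in) != -1) {
        fprintf(out, "%s", line);   /* the line still includes its '\n' */
        count++;
    }
    free(line);
    return count;
}
```

Calling echo_lines(stdin, stdout) from main() gives a program that copies its input unchanged, which is a convenient starting point before adding the field splitting.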

To compile and run your code:

  • put the code in a file (e.g. converter.c).
  • run make (alternatively you can run gcc converter.c) to create an executable.
  • call the executable with ./[name of executable].

For each function, other than main, you should have the function prototype in a file titled converter.h. Be sure to have #include "converter.h" at the top of your converter.c file.

Once you can do that, you are ready to work on the assignment. Modify your program to create a program that reads CSV data from standard input and writes a CSV file to standard output.

You will receive 8 points for the primary converter program.

Part 2:

Some CSV files allow fields to contain commas. Each field can optionally be enclosed in quotes. [2 points]

An input CSV file with four fields per record could look like:

1,2,"test,string",4

When used as

./cnvtr -c 4 3 4

Output is

"test,string",4

Note: You should only choose from the following library functions: printf(), fprintf(), getopt(), free(), atoi(), getline() and exit() in your code.

Note: function prototypes should be written in converter.h

Note: There will not be any inputs with unmatched quotes. For example, 1,2,"test,string,4 is an illegal input. A quote will never be part of a string (i.e., there is no escape character).
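One possible way to handle quoted fields is to track whether the walk is currently inside a pair of quotes and to treat a comma as a separator only when it is not. The sketch below assumes this approach; the name split_quoted_record and its return convention are hypothetical. The quotes are left in the field text, which matches the sample output above.

```c
#include <stddef.h>

/*
 * Hypothetical helper: split one CSV record in place, treating commas
 * inside a quoted field as ordinary text.
 * buf    : the line filled by getline(); separators are overwritten
 * fields : caller-allocated array with room for ncols pointers
 * ncols  : expected number of fields (the -c value, at least 1)
 * Returns the number of fields found, or ncols + 1 as soon as the record
 * turns out to have more than ncols fields.
 */
int split_quoted_record(char *buf, char **fields, int ncols)
{
    char **next = fields;   /* next array entry to fill */
    char *p = buf;
    int in_quotes = 0;      /* 1 while between an opening and closing quote */

    *next++ = p;
    while (*p != '\0') {
        if (*p == '"') {
            in_quotes = !in_quotes;   /* the quote stays part of the field */
        } else if (*p == ',' && !in_quotes) {
            *p = '\0';                /* separator outside quotes */
            if (next - fields == ncols)
                return ncols + 1;
            *next++ = p + 1;
        } else if (*p == '\n') {
            *p = '\0';                /* end of the record */
            break;
        }
        p++;
    }
    return (int)(next - fields);
}
```

Because quotes are always matched and never escaped in the input, a single flag is enough; no nesting or escape handling is needed.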

Style and Commenting

No points are explicitly given for style, but teaching staff won’t be able to provide assistance or regrades unless the code is readable. Please take a look at the course Style Guidelines.

Submission and Grading

  1. Submit your files to Gradescope under the corresponding assignment. You will submit the following files:

converter.c converter.h README

You can upload multiple files to Gradescope by holding CTRL while you are clicking the files. You can also hold SHIFT to select all files between a start point and an endpoint.

Alternatively, you can place all files in a folder and upload the folder to the assignment. Gradescope will upload all files in the folder.

  2. After submitting, the autograder will run a few tests:
       • Checks that all required files were submitted
       • Checks that converter.c compiles
       • Runs some tests on converter.c

Make sure to check the autograder output after submitting! We will be running additional tests after the deadline passes to determine your final grade.

The assignment will be graded out of 10 points, with 8 points allocated to the main Converter program (part 1), and 2 points allocated to CSV with commas (part 2). Make sure your assignment compiles correctly through the provided Makefile on the ieng6 machines. Any assignment that does not compile will receive 0 credit for parts 1 & 2.

 
