JD:Week 10:Discussion 1

 

Learning Resources

Readings

Media

  • Web Video: Kirk, M. (Writer & Director). (2005, October 18). The torture question [Television series episode]. In M. Kirk & J. Gilmore (Producers), Frontline. Boston: WGBH Educational Foundation. Retrieved from http://www.pbs.org/wgbh/pages/frontline/torture/
    • Chapter 2, “The Afghanistan War Prisoners”
    • Chapter 6, “Abu Ghraib – And Beyond” 

Discussion 1 – Week 10: Terrorism and the Law
International laws are a group of agreements, or treaties, between nation-states, based on a specific set of values and standards. They may address many different issues, including foreign affairs and trade, criminal conduct, and terrorism. In some cases, international laws are ambiguous at best and require a great deal of interpretation to decipher their intent and meaning. This is true of laws outlined in the Fourth Geneva Convention, one treaty you read about this week. Among other things, the laws in the Fourth Geneva Convention address procedures for administrative detainment of individuals. The laws do not contain specific language that bans countries from infringing on the personal liberties of detainees for the sake of national security. As a result, some nation-states, including the U.S., have been more liberal in their interpretation of the laws, which ultimately influences their policies and procedures for the detainment of terrorists. In this Discussion, you explore more examples of ambiguities in international law and consider the impact on U.S. policies and procedures regarding the detainment, designation, and treatment of terrorists.

To prepare for this Discussion:

  • Review the article “Administrative Detention in Armed Conflict” and the online article “International Law and the Nation-State at the U.N.: A Guide for U.S. Policymakers.” Pay particular attention to examples in which international law regarding terrorism is ambiguous and consider the impact on U.S. terrorism policies and procedures.
  • Select two specific examples in which international law regarding terrorism is ambiguous to use for this assignment.
  • Think about how each of the examples you selected did or might influence U.S. policies and procedures regarding the detainment, designation, and/or treatment of terrorists.

With these thoughts in mind:

Post by Day 3 two examples in which international law regarding terrorism is ambiguous. Then explain the implications of each example on U.S. policies and procedures regarding the detainment, designation, and/or treatment of terrorists. Be specific. 

Video Reflection 2 [WLOs: 2, 3, 5] [CLOs:1, 2, 3, 4]

 

In the past two weeks, you have chosen a publicly traded company and have prepared Section 3 of the Week 5 final project. Section 3 evaluated the stock price of the company using the constant growth formula. This week, you practiced NPV calculations in the discussion, and you are working on Section 4 of the Final Project. In Section 4, you will use the capital asset pricing model (CAPM) to calculate the company’s required rate of return. Then, using this CAPM required rate of return, you will recalculate the company’s stock price using the constant growth formula.
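As an illustration of the two calculations named above, the following minimal sketch chains the CAPM required rate of return into the constant growth formula. The function names and all input values (risk-free rate, beta, market return, dividend, growth rate) are hypothetical placeholders, not data for any real company.

```python
def capm_required_return(risk_free_rate, beta, market_return):
    """CAPM: required return = risk-free rate + beta * market risk premium."""
    return risk_free_rate + beta * (market_return - risk_free_rate)


def constant_growth_price(next_dividend, required_return, growth_rate):
    """Constant growth formula: price = D1 / (r - g), valid only when r > g."""
    if required_return <= growth_rate:
        raise ValueError("required return must exceed the growth rate")
    return next_dividend / (required_return - growth_rate)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic.
    r = capm_required_return(risk_free_rate=0.03, beta=1.2, market_return=0.08)
    d1 = 2.00 * (1 + 0.04)  # last dividend grown one year at an assumed 4%
    print(round(r, 4), round(constant_growth_price(d1, r, 0.04), 2))
```

Because the price depends on the difference between the required return and the growth rate, small changes in beta can move the estimate substantially.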

At this point, you have begun to develop an understanding of the value of the company’s stock. Ultimately, you will need to decide if you can recommend investing in this company’s stock (with a buy recommendation) or if you do not feel it is a good investment (a sell recommendation). For many companies, the evidence will be quite strong in one direction or the other. For other companies, the evidence will be conflicting, and you may consider issuing a hold recommendation.

Prepare:

Prior to beginning work on this journal,

Record:

Record a two- to three-minute video answering the following questions:

  • What are the similarities between the time value of money formulas (from Week 3) and the NPV analysis in the Week 4 discussion?
  • What is the purpose of NPV analysis? Be sure to discuss the concepts of risk and return in your answer.
  • What are two improvements or corrections you could make to your previous assignments in preparation for the final project that is due in Week 5?
  • What is going well and what are you struggling with in regard to Weeks 3 and 4?
  • What is one question you have about the Week 4 assignment or the Week 5 final project?
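
As a reference point for the first two questions, here is a minimal sketch of how NPV reuses the present value formula from the time value of money: each future cash flow is discounted with the same formula, and the initial outlay is subtracted. The function names, discount rate, and cash flows are hypothetical examples.

```python
def present_value(future_value, rate, periods):
    """Time value of money: PV = FV / (1 + r)^n."""
    return future_value / (1 + rate) ** periods


def npv(rate, initial_outlay, cash_flows):
    """NPV = -initial outlay + sum of the present values of each year's cash flow."""
    discounted = sum(
        present_value(cash_flow, rate, year)
        for year, cash_flow in enumerate(cash_flows, start=1)
    )
    return discounted - initial_outlay


if __name__ == "__main__":
    # Hypothetical project: pay 1,000 today, receive 500 per year for three years.
    print(round(npv(0.10, 1000, [500, 500, 500]), 2))
```

The discount rate carries the risk and return trade-off: a riskier project calls for a higher rate, which lowers the NPV of the same cash flows.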

Discussion on Continuity of Care

Please respond to the post below in one page, with in-text citations and references.

Step 1: View the following video: Why care continuity means never discharging patients: https://www.youtube.com/watch?v=mx08kmDxgmM

Step 2: Review the Continuity and Coordination of Care by the World Health Organization: https://apps.who.int/iris/bitstream/handle/10665/274628/9789241514033-eng.pdf?ua=1

Step 3: Read the following article: How important is continuity of care? https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2083711/pdf/381.pdf

Step 4: View the following video: Communicating with Doctors as a New Nurse or Nursing Student Tips: https://www.youtube.com/watch?v=Jr4gfgZzMjc

Continuity of care is defined as the sound, timely, smooth, unfragmented, and seamless transition of a client: from one area to another within the same healthcare facility; from one level of care to a more or less intense level of care, based on the client’s status and level of acuity; from one healthcare facility to another; and from a facility to the client’s home in the community upon discharge.

Maintaining the continuity of care requires that the nurse, and other members of the healthcare team, identify current client needs and then move the client to the appropriate clinical area, to the appropriate level of care, and to the appropriate healthcare facility in a timely and effective manner.

The Nurse’s role in Continuity of Care:

  • Coordinate care with the interprofessional team
  • Act as a liaison and be a client advocate
  • Complete admission, transfer, discharge, and post-discharge prescriptions
  • Initiate, revise, and evaluate the plan of care
  • Report the client’s status
  • Coordinate discharge planning
  • Facilitate referrals and use of community resources

Communication, collaboration and cooperation among and between appropriate healthcare team members and the client are essential components of the continuity of care.

Initial Discussion Assignment:

Part 1: Create a detailed scenario in which a patient with frequent re-admissions for Chronic Obstructive Pulmonary Disease (COPD) is successfully integrated into a care plan that exhibits continuity of care from admission to successful discharge to home with follow-up visits and monitoring. You are not allowed to simply agree or disagree in your responses to the initial postings. All responses must have evidence-based reasons why you agree or disagree and must be supported by research. At least one evidence-based article must be referenced.

Part 2: Your patient has transitioned home; however, during a follow-up visit you notice that the patient is having a mild exacerbation, and you need to notify the patient’s doctor for possible admission. Create a detailed ISBAR report prior to calling the physician.

Week 1 Assignment CI

 

Questions

  1. Do you have an idea for a project? If yes, what is it? If no, what area of your specialty interests you the most? “Avoiding failure in clinical trials”
  2. Is your project idea(s) compatible with your organization’s current mission and purpose? In what way? If no, can the idea be revised to better fit the organization’s current mission and purpose?
  3. Is your project unique? Are there other organizations or groups doing this same work? Is there a way to revise the project idea to make it more unique?
  4. What need does your idea address? (The answer to this question will become the framework for the proposal’s need statement).
  5. How would your idea improve the situation? (This answer will become the basis of the proposal’s goals and objectives).
  6. What do you plan to do to improve the situation? (This answer will become the basis of the proposal methodology).
  7. How will you know if your idea worked? (This answer will become the basis of the evaluation section of the proposal).
  8. How much will your idea cost? (This answer will become the basis of the budget).
  9. Is there support within the organization for the project? Consider if the idea might need internal support of organization leadership/administration or perhaps external support from community leaders, school board, or church leaders.
  10. Is there a way in which these services might be funded in the future once the grant money runs out? (Think sustainability).

Ethical Issues Presentation

 

Research Paper #1*

Exercise Content. Essay and PowerPoint

  1. Ethical Issue PowerPoint Presentation
    Topic:  Elective Abortion.

    The ethical issues presentation will address an ethical issue associated with the practice of nursing. The issue selected for discussion should have clearly identifiable pros and cons that, when analyzed, will allow the student to form a defensible position related to the issue. Principles from identified codes of ethics should be examined in relation to the issue and position. The PowerPoint presentation should have 12 to 15 slides, not counting the title and reference slides.
    The student should address the following:
    1. Define the scope of the ethical issue.
    2. Examine the scope of the issue as it relates to nursing and principles identified in codes of ethics.
    3. Identify at least 2 positions taken on this issue by scholarly experts in the ethics discipline.
    4. Explore the future for the issue as it relates to nursing practice.

    Grading Criteria for the Ethical Issue Essay:
    Definition and scope of the ethical issue                                          20%
    Scope of the issue related to the nursing profession                               20%
    Positions on the issue by scholarly experts                                        20%
    Exploration of the future for the issue related to healthcare and nursing practice 20%
    Organization and presentation skills                                               20%
    Total                                                                             100%

     

COACHING AND PERFORMANCE MANAGEMENT

Background

The purpose of the Case Assignment is to create a “Live Case” by experiencing the process of coaching.  Because this case is designed around experiential learning, we can go beyond the conceptual knowledge covered in the reading materials to actual skills building.  This requires putting what you are learning into immediate practice.

In this third module, you will be working with your coachee to explore options based on the coachee’s assessment of goals and current reality (as determined in Case 2).  The objective of this session is to get your coachee to commit to specific actions.  Drawing on the background reading for this and the previous modules, you will plan and carry out a coaching session that involves stage O of the GROW model.

There is a comprehensive explanation of the GROW model on the background page for Module 2. Here is a shorter synopsis:

The GROW model: A simple process for coaching and mentoring.  (2014). Mind Tools. Retrieved from www.mindtools.com/pages/article/newLDR_89.htm

The structure of the Live Case (As a reminder, each case involves three separate activities.)

Each module will follow this cycle: Plan, execute, report.

  • Before the coaching session, write up a plan using course readings or additional research as a resource (1-2 pages).
  • Then meet with the coachee, and use your plan as a guide for the session.
  • The bulk of the report is on how it went, including successes and failures.  What would you do differently next time?  (3 to 5 pages). 

Preplanning (1-2 pages)

  • What are your goals for the session?
  • What actions do you plan?
  • How will you know if you are successful?

Action

  • Meet with coachee (45-50 minutes).

Reflection

  • Report on the session.
  • Provide a narrative descriptive summary of the conversation as it occurred (1 or 2 paragraphs).
  • How do you feel the session went?
  • Analyze the process and outcomes of your coaching.
  • What new knowledge did you gain?
  • What would you do differently next time?

Assignment instructions                                                                                        

This phase of the coaching process requires brainstorming.  Think you know everything there is to know about brainstorming?  Too often, we overlook some essential basics about processes we think we know well.  

Van Valin, S. (2014). Brainstorming. Leadership Excellence, 31(2), 20-21. Retrieved from ProQuest.

  • Brainstorm as many options as possible that will help your coachee achieve his or her goal.
  • Discuss the options and select the best ones.
  • You may offer your suggestions, but let your coachee do most of the work of generating and evaluating the options.  Remember that the objective is to get the coachee to commit to action, and this means that the coachee must feel “ownership” of the plan.
  • Write up this meeting as indicated in the Keys to the Assignment below.
  • Turn in your 4- to 6-page paper 

Keys to the Assignment

  • After reading the background materials for this module and doing additional research if needed, prepare your pre-coaching plan for a 45-50 minute session:
    • What are your goals for this session? How will you know if you are successful?
    • What skills will you use?
    • How will you go about doing this?
    • What questions will you ask?
  • Conduct your coaching session (45 to 50 minutes). Remember the ultimate goal of the session is to come up with a plan to which the coachee commits.
  • Write up your post-coaching reflection.
    • Report the facts of the coaching session; summarize the plan.
    • What went well and what did not?
    • What did you learn about coaching from this session?
    • What would you do differently next time?

Interview and Leadership Analysis

 

Overview

While it is still early in your doctoral education process, it is important to start thinking about where you might complete your doctoral project and who could be your preceptor. This assignment provides an opportunity for you to evaluate a potential practicum site and a leader to serve as preceptor for your final doctoral project. Your interviewee should be in a health care leadership position with a graduate degree in a health care-related field. Ideally he or she should have a doctorate, but a master’s degree is acceptable.

This assignment has two parts:

  1. Interview a health care leader who could serve as your preceptor for your final doctoral project.
    • Summarize the interview in a 3–5 page paper.
  2. With guidance from the health care leader you interview, determine a gap, need, or opportunity that will serve as the basis for the remaining assignments in this course. It should align with a strategic priority of the organization or health care system.
    • Keep in mind this could also serve as a topic for your doctoral project. Explore the feasibility and fidelity of this with the leader, including alignment with the capstone process and timeline.

Instructions

The following corresponds to the grading criteria in the scoring guide, so be sure to address each listed point. Consider reviewing the performance-level descriptions for each criterion to see how your work will be assessed.

  1. Evaluate the primary leadership style of a chosen leader in a health care management position.
    • What are the leader’s credentials and what is his or her formal position in the organization?
    • What role does the leader play in the organization, system, or public health arena?
      • Is he or she a mentor to others?
      • How visible is the leader outside the organization?
      • How would you characterize his or her leadership style?
      • What is his or her role in communication?
      • Is he or she viewed as a change agent?
  2. Assess a leader’s organizational role as it relates to quality, safety, and evidence-based standards.
    • How is the leader’s role interdisciplinary?
    • What is one example of how this leader facilitates, participates in, or fosters interprofessional or interdisciplinary collaboration?
    • What is an example of how this leader champions quality and safety in the organization or public health system?
  3. Explain the rationale behind the selection of a leader to serve as a preceptor.
    • Describe why this leader is qualified to be a doctoral preceptor.
      • Is this person trusted and respected in the organization and community?
    • Identify the connections he or she has to formal and informal power and how these might be leveraged to assist you in a quality improvement project or evidence-based practice change.
      • For example, could you get help with navigating policy, recruiting team members, enlisting buy-in?
  4. Identify a gap, problem, or opportunity for a capstone project.
    • Can this be connected to an organizational, systemic, or public health strategic priority?
    • What are the implications for patient outcomes and safety?
    • In the context of the health care system identified, how will you include the voice of the patient?
  5. Summarize a leader’s ability to provide ethical stewardship and oversight when accessing sensitive organizational information.
    • What is within the leader’s capacity to protect?
    • What challenges has the leader encountered with regard to the ethical use of the organization’s private data?
    • How does a leader provide oversight with communications related to organizational information?
  6. Convey purpose, in an appropriate tone and style, incorporating supporting evidence and adhering to organizational, professional, and scholarly writing standards.

Additional Requirements

  • Length of paper: 3–5 pages of content plus title and reference pages.
  • Resources: At least 3 sources including personal communications when appropriate, published within the last 5 years.
  • APA format: Cite your sources using current APA format.
  • Font and font size: Times Roman, 12 points.

Staffing and planning

 

Assignment 1: Staffing Plan for a Growing Business
Due Week 6

Please choose from one (1) of the scenarios below. Note: The scenario that you choose in this assignment will be the one (1) with which you continue for Assignment 2.

Scenario 1

You are a Human Resources Manager of an expanding technology company consisting of 170 employees that develops and distributes small electronic devices. Over the past two (2) years, a research group formed, designed, and built prototypes of small remote surveillance cameras used for security. Recently, your company won a contract to build and provide these remote surveillance cameras to various government agencies. The contract will begin with your company supplying these cameras to agencies within your home state. If all orders are fulfilled sufficiently, the contract will be expanded to supplying agencies outside of your home state.

For the immediate future, you will need to secure a larger facility and hire more staff to sustain the first part of the contract. This staff will consist of ten (10) Assembly Technicians, one (1) Certified Quality Control Engineer, one (1) Contract Administrator, and one (1) Office Support Paraprofessional. Meanwhile, there is a contract clause requiring that you provide a staffing plan in order to ensure future product deliveries and sustain the possible future growth.    

Scenario 2

You are a former certified education administrator who departed your former position to become the owner of a small, in-home day care consisting of you and a part-time assistant where you care for children from age three (3) to age ten (10). Over the course of time, your demographic population has increased due to significant business growth that has resulted in many families relocating to your area. With more businesses projected to move to the area and the building of new housing developments, it is projected that this growth could be long term.

You have decided that this is a good opportunity to expand your day care business as you have received many inquiries for childcare. In order to comply with your home state regulations, you will require a larger facility and will need to hire additional staff in order to sustain the larger demand for day care. This staff will consist of five (5) Certified Day Care Professionals, one (1) Registered Nurse Professional, five (5) After-School Assistants and one (1) Office Support Paraprofessional. You have secured approval for a bank loan and qualify for future loans for future expansion if your current endeavor is successful. Meanwhile, the state in which you operate has requested that you provide a staffing plan before it will issue licensure for your expanded capacity.

Note: You may create and / or make all necessary assumptions needed for the completion of these assignments.

Select one (1) of the scenarios and write a four to five (4-5) page paper in which you:

  1. Identify two (2) types of staffing models that could apply to your chosen scenario and determine which model would be best suited for efficiency, productivity, and possible future growth. Examine the significant effect of each identified staffing model on processes that may be occurring within the organization (e.g., outsourcing, contingent workers, consulting firms, etc.).
  2. Predict the major potential legal issues that you may encounter when establishing equal employment opportunities and diversity within the workplace while still aiming to acquire employees with the needed certifications and credentials. Next, explain the method of achieving transparency within your staffing model. Justify your response.
  3. Specify three (3) tasks that you need to perform to identify, analyze, and develop job requirements and task statements that you will include in formalized job descriptions. Next, predict the frequency with which you would need to review and adjust these job descriptions as your company progresses. Provide a rationale for your response.
  4. Describe three (3) methods to deal with high employee turnover and the availability of employees with required knowledge, skills, or abilities. Next, describe the primary manner in which the described succession-planning methods would be beneficial to your company. Justify your response.
  5. Use at least three (3) quality resources in this assignment. Note: Wikipedia and similar Websites do not qualify as quality resources.

Your assignment must follow these formatting requirements:

  • Be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides; citations and references must follow APA or school-specific format. Check with your professor for any additional instructions.
  • Include a cover page containing the title of the assignment, the student’s name, the professor’s name, the course title, and the date. The cover page and the reference page are not included in the required assignment page length.

The specific course learning outcomes associated with this assignment are:

  • Explain the role of staffing to support an organization’s strategy and improve productivity.
  • Develop a model for staffing an organization that supports the firm’s Human Resources Management strategy and sustains productive operations.
  • Summarize the key legal compliance issues associated with staffing organizations.
  • Explain the planning considerations for staffing organizations, the use of job analysis, and the components of a staffing plan.
  • Use technology and information resources to research issues in staffing organizations.
  • Write clearly and concisely about staffing organizations using proper writing mechanics.

Click here to view the grading rubric.

Capstone – Discussion: Peer Support Activity responses

In response to your peers, offer any advice and feedback to address your peers’ progress and concerns.

Post # 1

Jennifer Wine 

For my project, I chose the topic: How is telemedicine increasing efficiency in primary care for elderly men with chronic disease in rural areas? The resources needed for this program will be a telehealth program, technology, staffing and staff training, funding, and partnerships. We will utilize our current staff; however, we will need to hire, externally or internally, a telehealth coordinator.

The goal for the program is for telehealth to be an integral method of providing health care services to elderly men with chronic diseases in rural areas so they can stay at home but still receive consistent quality care. Because the technology for the initial startup of telehealth, and the costs associated with it, can be expensive, research on grants will be conducted. Many grants have been made available to help with the purchase of such technology for those able to secure one.

I do not have any concerns at this point; I have been able to pull a lot of information from scholarly articles thus far.

Thank you,

Jennifer

Post #2 

Barbara Brash 

Through research conducted in this course I propose to answer the question, how does clinical documentation improvement impact risk, quality, and reimbursement in the hospital setting? The importance of accurate and complete documentation is drilled into every clinical discipline from day one, and never ends. That focus tends to be on risk. However, as the industry continues to move away from fee-for-service and the rate of full-scale audits rapidly increases, the impact of documentation on reimbursement has become a stark reality. Therefore, the return on investment for clinical documentation improvement/integrity (CDI) programs is much more easily recognized, making it less challenging for program managers to garner support from stakeholders (Haas, 2013). 

The human resources required for a successful CDI program include a program manager, clinical documentation specialists (CDS), coders, health information management (HIM) professionals, a physician advisor (PA), business/data analysts (BA), and information technology (IT) specialists. Financial resources for continued education and training are imperative, as industry-accepted diagnosis and procedure codes, standard definitions, and correlating documentation requirements are revised routinely. The most vital, and complex, resource required for CDI program effectiveness and sustainability is the software platform. This is especially significant to my research as we just completed a full-scale, organization-wide software transition (yes, we did this during a pandemic). This has drastically impacted the accuracy and timeliness of our clinical documentation. As we work through the problems, provide ongoing support internally and to providers, and make corrections associated with this transition, we are much less productive in our standard daily work. Our billing has also been delayed, which has resulted in less cash on hand than projected in the budget. As my organization struggles with the effects of the pandemic and the new software platform, my major concern is appropriately problem solving, educating, creating new workflows, and efficiently releasing bills with fewer resources. Reviewing current roles and associated activities, and collaborating with program managers/participants in other areas to optimize our combined resources and achieve performance goals, is the priority.

References

Haas, D. (2013, February). Clinical documentation improvement: What executives need to know and the financial impact of neglect. Becker’s Healthcare. https://www.beckershospitalreview.com/finance/clinical-documentation-improvement-what-executives-need-to-know-and-the-financial-impact-of-neglect.html

Creating a search engine in Python language

 

Goal: Implement a complete search engine.

Milestones Overview

Milestone #1 Goal: Produce an initial index for the corpus and a basic retrieval component

Milestone #2 Goal: Complete Search System

PROJECT: SEARCH ENGINE

Corpus: all ICS web pages

We will provide you with the crawled data as a zip file (webpages_raw.zip). This contains the downloaded content of the ICS web pages that were crawled during a previous quarter. You are expected to build your search engine index off of this data.

Main challenges: Full HTML parsing, file/DB handling, and handling user input (using a command line, a desktop GUI application, or a web interface)

COMPONENT 1 – INDEX:

Create an inverted index for the entire corpus given to you. You can either use a database to store your index (MongoDB, Redis, and memcached are some examples) or you can store the index in a file. You are free to choose an approach here. The index should store more than just a simple list of documents where the token occurs. At the very least, your index should store the TF-IDF of every term/document pair.

Sample Index:

Note: This is a simplistic example provided for your understanding. Please do not consider this as the expected index format. A good inverted index will store more information than this.

Index Structure: token – docId1, tf-idf1 ; docId2, tf-idf2

Example: informatics – doc_1, 5 ; doc_2, 10 ; doc_3, 7

You are encouraged to come up with heuristics that make sense and will help in retrieving relevant search results. For example, words in bold and in headings (h1, h2, h3) could be treated as more important than other words. These are useful metadata that could be added to your inverted index data.

Optional (1 point for each metadata item, up to 2 points max): Extra credit will be given for ideas that improve the quality of the retrieval, so you may add more metadata to your index if you think it will help. For this, instead of storing a simple TF-IDF score for every page, you can store more information related to the page (e.g. the positions of the words in the page). To store this information, you need to design your index in such a way that it can store and retrieve all this metadata efficiently. Your index lookup during search should not be horribly slow, so pay attention to the structure of your index.

COMPONENT 2 – SEARCH AND RETRIEVE:

Your program should prompt the user for a query. This doesn’t need to be a Web interface; it can be a console prompt. At the time of the query, your program will look up your index, perform some calculations (see ranking below), and give out the ranked list of pages that are relevant for the query.
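
To make the index structure and the lookup step concrete, here is a minimal in-memory sketch, not the expected index format. It makes several assumptions that are not part of the assignment: the standard-library html.parser is used to extract text (it does not raise on unclosed tags, but it also keeps script and style text), tokens are lowercase alphanumeric runs, and the weight is the log-scaled variant (1 + log10 tf) * log10(N / df). All function and class names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes; tolerant of missing closing tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def tokenize(html):
    """Assumed tokenizer: lowercase alphanumeric runs from the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower())


def build_index(docs):
    """docs maps doc_id -> HTML. Returns {token: {doc_id: tf_idf}}."""
    term_counts = {doc_id: Counter(tokenize(html)) for doc_id, html in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())  # each document counts once per token
    index = defaultdict(dict)
    for doc_id, counts in term_counts.items():
        for token, tf in counts.items():
            idf = math.log10(len(docs) / doc_freq[token])
            index[token][doc_id] = (1 + math.log10(tf)) * idf
    return dict(index)


def lookup(index, term):
    """Basic retrieval: unranked doc ids that contain a single query term."""
    return sorted(index.get(term.lower(), {}))
```

With this weighting, a token that appears in every document gets a score of zero, and the whole index lives in memory; a real corpus of this size would likely need the file or database storage described above.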

COMPONENT 3 – RANKING:

At the very least, your ranking formula should include tf-idf scoring, but you should feel free to add additional components to this formula if you think they improve the retrieval. Optional (1 point for each parameter up to 2 points max): Extra credit will be given if your ranking formula includes parameters to improve ranking other than tf-idf from the techniques discussed in class.
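
One possible shape for such a ranking formula is sketched below: each document's score is the sum of the tf-idf weights of the query terms it contains, with a boost when the term appeared in a heading. The index layout, the sample weights, and the HEADING_BOOST value are all hypothetical assumptions for illustration, and summed tf-idf is only one of several reasonable scoring choices.

```python
from collections import defaultdict

# Hypothetical index layout: token -> {doc_id: (tf_idf, appeared_in_heading)}
SAMPLE_INDEX = {
    "computer": {"doc_1": (0.30, False), "doc_2": (0.45, True)},
    "science": {"doc_1": (0.50, False), "doc_2": (0.20, False), "doc_3": (0.60, False)},
}

HEADING_BOOST = 1.5  # assumed multiplier for terms found in h1/h2/h3


def rank(index, query, top_k=20):
    """Score documents by summed (optionally boosted) tf-idf over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        for doc_id, (tf_idf, in_heading) in index.get(term, {}).items():
            scores[doc_id] += tf_idf * (HEADING_BOOST if in_heading else 1.0)
    # Highest score first; ties broken by doc id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_k]


if __name__ == "__main__":
    for doc_id, score in rank(SAMPLE_INDEX, "computer science"):
        print(doc_id, round(score, 3))
```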

Milestone #1 (15 points) Goal: Build an index and a basic retrieval component By basic retrieval component; we mean that at this point you just need to be able to query your index for links (The query can be as simple as single word at this point). These links do not need to be accurate/ranked. We will cover ranking in the next milestone. At least the following queries should be used to test your retrieval: 1 – Informatics 2 – Mondego 3 – Irvine 4 – artificial intelligence 5 – computer science Note: query 4 and 5 are for milestone #2 Deliverables: Submit a report (pdf) in Canvas with the following content:

1. A table with assorted numbers pertaining to your index. It should have, at least the number of documents, the number of [unique] words, and the total size (in KB) of your index on disk.

2. The number of URLs retrieved for each of the queries above, and a listing of the first 20 URLs for each query.

Evaluation criteria:

● Was the report submitted on time?

● Are the reported numbers plausible?

● Are the reported URLs plausible?

Milestone #2 (45 points and 6 pts

Goal: Complete search engine

Deliverables:

● Submit a zip file containing all the artifacts/programs you wrote for your search engine

● A live demonstration of your search engine

Evaluation criteria:

– Does your program work as expected of search engines?

– How general are the heuristics that you employed to improve the retrieval?

– How complete is the UI? (e.g. links to the actual pages, snippets, etc.)

– Do you demonstrate in-depth knowledge of how your search engine works? Are you able to answer detailed questions pertaining to any aspect of its implementation?

Additional Information:

Understanding the data dump: In Assignment-2, the crawlers of all the groups collectively crawled 37,497 URLs. We collected these URLs and are providing them to you as the ‘webpages_clean.zip’ file. This zip file contains the following:

1. bookkeeping.json
2. bookkeeping.tsv
3. Folders 0 to 74

Folders: The 37,497 URLs are organized into 75 folders, each folder having up to 500 files. Every file has the extracted HTML source code of a particular URL.

Bookkeeping files: bookkeeping.json and bookkeeping.tsv are two different formats of the same file. These files maintain a list of all the URLs that have been crawled. Every URL has an identifier associated with it. This identifier helps locate the HTML code of the URL. The identifier is of the format “folder_number/file_number”. For example, consider the entry on line 13 of bookkeeping.json:

“0/108”: “vision.ics.uci.edu/papers/RamananBK_ICCV_2007”

This means that the HTML code extracted for the link “vision.ics.uci.edu/papers/RamananBK_ICCV_2007” is located in folder 0, file number 108.

Broken HTML: The HTML source code of the URLs may not be well formed. This means that the code may not necessarily have matching pairs of opening and closing tags. For example, there might be an open <strong> tag but the associated closing tag </strong> might be missing. The HTML parsers that you use to parse the documents should be able to handle broken HTML. Hence, as mentioned above, while selecting the parser for your project, please ensure that it can handle broken HTML.

Use of libraries: You are strictly not allowed to use libraries that perform the entire task of index creation or ranking for you. Hence, libraries such as Lucene or Elasticsearch are not allowed. You may use libraries that help you achieve specific tasks. For example, you can use a library such as NLTK to tokenize your content.

The HTML files

• You are given a zip file that contains crawl-able HTML files which you may parse/process for extracting tokens.

• The HTML files have been organized and stored in numbered directories. The file names are numbers as well.

• The bookkeeping.json and bookkeeping.tsv files represent the index of all the HTML files.

• The key of each entry in the json file is essentially the relative file path of the HTML content. The value is the web URL of the HTML content.

• Do not confuse bookkeeping with the inverted index. It simply provides you a means to access the crawl-able HTML files programmatically.

• The keys in bookkeeping can also be used to uniquely identify the files. This will be useful when you need to retrieve the web page and the content while displaying your search engine results.
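
The bookkeeping lookup described above could be sketched in Python as follows. This is only an illustration: the function names are hypothetical, `root` is assumed to be the directory into which the zip file was extracted, and UTF-8 decoding with replacement of undecodable bytes is an assumption about the files rather than something the data dump guarantees.

```python
import json
from pathlib import Path


def load_bookkeeping(root: Path) -> dict:
    """Return the mapping "folder_number/file_number" -> URL."""
    with open(root / "bookkeeping.json", encoding="utf-8") as f:
        return json.load(f)


def read_html(root: Path, doc_id: str) -> str:
    """Read the raw HTML stored for an identifier such as "0/108"."""
    folder, file_name = doc_id.split("/")
    # errors="replace" keeps indexing going if a file has odd bytes (assumption).
    return (root / folder / file_name).read_text(encoding="utf-8", errors="replace")
```

With these two helpers, iterating over the keys of the loaded mapping visits every crawled page, and the same key can later be used to look up the URL to display in the search results.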

Building the inverted index

• Now that you have been provided the HTML files to index, you may build your inverted index off of them.

• As most of you may already know, the inverted index is simply a map with the token as a key and a list of its corresponding postings as the value.

• A posting is nothing but the representation of the token’s occurrence in a document.

• The posting would typically contain (but is not limited to) the following info (you are encouraged to think of other attributes that you could add to the index):

  • The document name/id the token was found in.
  • The word frequency.
  • Indices of occurrence within the document.
  • Tf-idf score, etc.
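
To make the token-to-postings map concrete, here is a minimal in-memory Python sketch. The `Posting` class, the regular-expression tokenizer, and the assumption that the text has already been extracted from the HTML are all choices made for this example, not requirements of the assignment; a real index would also need to be persisted to disk or a database.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field

TOKEN_RE = re.compile(r"[a-z0-9]+")  # simple alphanumeric tokens (assumption)


@dataclass
class Posting:
    doc_id: str                                    # e.g. "0/108"
    positions: list = field(default_factory=list)  # token offsets in the document

    @property
    def frequency(self) -> int:
        return len(self.positions)


def tokenize(text: str) -> list:
    return TOKEN_RE.findall(text.lower())


def build_index(docs: dict) -> dict:
    """docs maps doc_id -> plain text; returns token -> list of Posting."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        seen = {}  # token -> Posting for the current document
        for position, token in enumerate(tokenize(text)):
            posting = seen.get(token)
            if posting is None:
                posting = seen[token] = Posting(doc_id)
                index[token].append(posting)
            posting.positions.append(position)
    return index
```

Storing positions rather than only a count keeps the word frequency available (it is the length of the position list) and leaves room for later additions such as phrase matching or a tf-idf field.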

Inverted Index

• When designing your inverted index, think about the structure of your posting first.

• You would normally begin by implementing the code to calculate/fetch the elements which will constitute your posting.

• Modularize. For example, if you’re using Python, use scripts that each perform a function or a set of closely related functions. This helps in keeping track of your progress, debugging, and also dividing work amongst teammates if you’re in a group.

• You are free to choose any database system to store your inverted index.

• Some possible options: Redis, MongoDB, memcached, MySQL, etc.

• Pro-tip: If you have a hard time choosing between the database systems, read about their performance and the learning curves of the libraries available for the language of your choice.

Search and Retrieve

• Once you have built the inverted index, you are ready to test document retrieval with queries.

• At the very least, the documents retrieved should be returned based on tf-idf scoring. This can be done using the cosine similarity method. Feel free to use a library to compute cosine similarity once you have the term frequencies and inverse document frequencies.

• You may add other weighting/scoring mechanisms to help refine the search results.
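
A rough Python sketch of cosine-similarity ranking over tf-idf weights is shown below. It assumes a simplified index shape (token -> {doc_id: term count}) and a log-weighted tf-idf variant, and the function name `rank` is hypothetical; document vector lengths are recomputed on every call here for brevity, whereas a real engine would likely precompute and store them.

```python
import math
from collections import Counter, defaultdict


def rank(query_tokens, index, num_docs, top_k=20):
    """Return up to top_k (doc_id, cosine score) pairs, best first.

    index maps token -> {doc_id: term count in that document}.
    """
    def weight(count, doc_freq):
        return (1.0 + math.log10(count)) * math.log10(num_docs / doc_freq)

    # Squared length of every document vector over all indexed terms.
    doc_norm_sq = defaultdict(float)
    for postings in index.values():
        for doc_id, count in postings.items():
            doc_norm_sq[doc_id] += weight(count, len(postings)) ** 2

    dot = defaultdict(float)
    query_norm_sq = 0.0
    for token, query_count in Counter(query_tokens).items():
        postings = index.get(token)
        if not postings:
            continue  # term not in the index
        query_weight = weight(query_count, len(postings))
        query_norm_sq += query_weight ** 2
        for doc_id, count in postings.items():
            dot[doc_id] += query_weight * weight(count, len(postings))

    scores = {}
    for doc_id, product in dot.items():
        denom = math.sqrt(query_norm_sq) * math.sqrt(doc_norm_sq[doc_id])
        if denom > 0:
            scores[doc_id] = product / denom
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]
```

Only documents that share at least one term with the query are scored, which keeps the lookup proportional to the size of the relevant posting lists rather than the whole corpus.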