Excel Help.
Moderator: Thanas
Excel Help.
I have two spreadsheets, both of which are listings of bars and clubs and the like. One of them is the listing for the entire Los Angeles metropolitan area (e.g. Burbank, Compton, Long Beach, etc), while the other is just the listing for the city of Los Angeles as a municipal entity (i.e. Downtown Los Angeles).
Basically, what I need to do, is find out which of the entries from the City of LA spreadsheet are already in the Greater LA one, as I know there are some missing (the Greater LA .xls is organized by city; it has 3,080 entries for LA, and the City of LA spreadsheet only has 2,868 entries). Those entries then need to be deleted, en-masse is possible.
Is there any way to do this? Was the previous, convoluted paragraph even intelligible? =P Obviously, there's always the Mark I Eyeball method of manually going back and forth from spreadsheet to spreadsheet, Ctrl+Fing my way through and deleting as I go, but if there's a macro/script way to go about this, I would be eternally indebted to whichever of ye sagely gurus is the keeper of such eldritch information.
Basically, what I need to do, is find out which of the entries from the City of LA spreadsheet are already in the Greater LA one, as I know there are some missing (the Greater LA .xls is organized by city; it has 3,080 entries for LA, and the City of LA spreadsheet only has 2,868 entries). Those entries then need to be deleted, en-masse is possible.
Is there any way to do this? Was the previous, convoluted paragraph even intelligible? =P Obviously, there's always the Mark I Eyeball method of manually going back and forth from spreadsheet to spreadsheet, Ctrl+Fing my way through and deleting as I go, but if there's a macro/script way to go about this, I would be eternally indebted to whichever of ye sagely gurus is the keeper of such eldritch information.
Rome is an eternal thought in the mind of God... If there were no Rome, I'd dream of her.
--Marcus Licinius Crassus, Spartacus.
--Marcus Licinius Crassus, Spartacus.
- General Zod
- Never Shuts Up
- Posts: 29211
- Joined: 2003-11-18 03:08pm
- Location: The Clearance Rack
- Contact:
I don't suppose you could post screenshots? There might be a vlookup & if statement formula combination that can be used but it's hard to give an exact formula without seeing the sheet itself. As long as everything follows some sort of logical structure it should be relatively simple.
"It's you Americans. There's something about nipples you hate. If this were Germany, we'd be romping around naked on the stage here."
If you know that the rows for duplicates will be identical in each spreadsheet, you could merge the two together and then do something like this to clear it out.
Also, if you're dealing with this much data, I strongly suggest you learn how to use a database. Excel is not very good for this kind of work.
Also, if you're dealing with this much data, I strongly suggest you learn how to use a database. Excel is not very good for this kind of work.
- General Zod
- Never Shuts Up
- Posts: 29211
- Joined: 2003-11-18 03:08pm
- Location: The Clearance Rack
- Contact:
If the data's fairly uniform, it's a matter of inserting a single column and performing a vlookup off one of the cells, then copying the formula down. Excel is only really a bitch when it comes to using data that's jumbled and not very orderly.phongn wrote:If you know that the rows for duplicates will be identical in each spreadsheet, you could merge the two together and then do something like this to clear it out.
Last edited by General Zod on 2008-06-04 11:13pm, edited 3 times in total.
"It's you Americans. There's something about nipples you hate. If this were Germany, we'd be romping around naked on the stage here."
Well, there are sorting methods that help there, but using Access probably would've made his life a lot easier.General Zod wrote:If the data's fairly uniform, it's a matter of inserting a single column and performing a vlookup off one of the cells, then copying the formula down. Excel is only really a bitch when it comes to using data that's jumbled and not very orderly.
That was actually the first question I asked my boss when I got this job, if they used any kind of database. Unfortunately, the only DB is the one on the website, and I'm dealing with the raw data, because (apparently) it's uploaded directly from spreadsheet format.phongn wrote:If you know that the rows for duplicates will be identical in each spreadsheet, you could merge the two together and then do something like this to clear it out.
Also, if you're dealing with this much data, I strongly suggest you learn how to use a database. Excel is not very good for this kind of work.
I'll post a few screenshots when I'm at work tomorrow. I'm pretty much a nub with Excel as far as it goes past the basic functions. Fortunately, the data is incredibly uniform, the same tabling format is shared between both sheets. Essentially, the goal is to remove the entries already in the City of Los Angeles (which we've already dealt with, as far as my job goes) spreadsheet from the Greater Los Angeles one, so we can see which ones are left over - those are the ones that we need to work on.
Rome is an eternal thought in the mind of God... If there were no Rome, I'd dream of her.
--Marcus Licinius Crassus, Spartacus.
--Marcus Licinius Crassus, Spartacus.
- General Zod
- Never Shuts Up
- Posts: 29211
- Joined: 2003-11-18 03:08pm
- Location: The Clearance Rack
- Contact:
Actually, there's a much asier method for identifying duplicates. You'll need to insert two columns in front of the unique values you're trying to identify in each tab. In the first column you'll label it LA/City of LA respectively, then merge the sheets into a new tab so the data is aligned. Next you just use this code,phongn wrote:If you know that the rows for duplicates will be identical in each spreadsheet, you could merge the two together and then do something like this to clear it out.
Code: Select all
=if(a2=a1,"DUPLICATE","OK")
"It's you Americans. There's something about nipples you hate. If this were Germany, we'd be romping around naked on the stage here."
- General Zod
- Never Shuts Up
- Posts: 29211
- Joined: 2003-11-18 03:08pm
- Location: The Clearance Rack
- Contact:
Okay. . .you'll definitely want to merge both sheets and sort by address. Although I'd recommend inserting a new column in both tabs (putting the value LA, the other City of LA in each column respectively), to distinguish the two before merging the lists so you can sort them out later and figure out what goes where after identifying duplicates. Then use the formula up above in a new column next to the address column (which seems like the best method for identifying dupes), so you can identify which values are duplicates and sort everything by city/city of la afterwords.
"It's you Americans. There's something about nipples you hate. If this were Germany, we'd be romping around naked on the stage here."
- The Jester
- Padawan Learner
- Posts: 475
- Joined: 2005-05-30 08:34am
- Location: Japan
Step #1 - Copy them both into a new spreadsheet, so you don't screw with your source data.
Step #2 - Insert Column "Source Spreadsheet"
Step #3 - With each spreadsheet, fill that column in with the original name. This will let you resort them back to source later. A quick copy and paste over the required cells will handle this quickly
Step #4 - Sort by Address as the primary, Name as the Secondary
Step #5 - Use this code. Address Column Letters as needed)
=IF(A2 = A1, "Possible Duplicate", "")
Put this in the furthest right column (after the data), for each record
Step #6 - Paste that column as Values. All duplicates are now marked.
Step #7 - Read over the records that are marked as possible duplicates. It's possible you'll have different Business Names in similiar locations.
Remove any 'possible duplicates' markers that don't pan out. Replace ones that do with 'Duplicate'
Step #7 - If you need them seperated with the duplicates removed. Sort the 'Duplicate' column you created in Step #5 + pasted in #6, and delete the duplicates.
Step #8 - If need be; Sort by Spreadsheet, and paste them back.
Step #2 - Insert Column "Source Spreadsheet"
Step #3 - With each spreadsheet, fill that column in with the original name. This will let you resort them back to source later. A quick copy and paste over the required cells will handle this quickly
Step #4 - Sort by Address as the primary, Name as the Secondary
Step #5 - Use this code. Address Column Letters as needed)
=IF(A2 = A1, "Possible Duplicate", "")
Put this in the furthest right column (after the data), for each record
Step #6 - Paste that column as Values. All duplicates are now marked.
Step #7 - Read over the records that are marked as possible duplicates. It's possible you'll have different Business Names in similiar locations.
Remove any 'possible duplicates' markers that don't pan out. Replace ones that do with 'Duplicate'
Step #7 - If you need them seperated with the duplicates removed. Sort the 'Duplicate' column you created in Step #5 + pasted in #6, and delete the duplicates.
Step #8 - If need be; Sort by Spreadsheet, and paste them back.